1. 27 Oct, 2016 6 commits
  2. 26 Oct, 2016 15 commits
  3. 23 Oct, 2016 11 commits
  4. 22 Oct, 2016 7 commits
    • Merge branch 'bpf-numa-id' · 67dc1596
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      Add BPF numa id helper
      
      This patch set adds a helper for retrieving the current NUMA node
      id and a test case for SO_REUSEPORT.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • reuseport, bpf: add test case for bpf_get_numa_node_id · 3c2c3c16
      Daniel Borkmann authored
      The test case is very similar to reuseport_bpf_cpu, except that here
      we select socket members based on the current NUMA node id.
      
        # numactl -H
        available: 2 nodes (0-1)
        node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
        node 0 size: 128867 MB
        node 0 free: 120080 MB
        node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
        node 1 size: 96765 MB
        node 1 free: 87504 MB
        node distances:
        node   0   1
          0:  10  20
          1:  20  10
      
        # ./reuseport_bpf_numa
        ---- IPv4 UDP ----
        send node 0, receive socket 0
        send node 1, receive socket 1
        send node 1, receive socket 1
        send node 0, receive socket 0
        ---- IPv6 UDP ----
        send node 0, receive socket 0
        send node 1, receive socket 1
        send node 1, receive socket 1
        send node 0, receive socket 0
        ---- IPv4 TCP ----
        send node 0, receive socket 0
        send node 1, receive socket 1
        send node 1, receive socket 1
        send node 0, receive socket 0
        ---- IPv6 TCP ----
        send node 0, receive socket 0
        send node 1, receive socket 1
        send node 1, receive socket 1
        send node 0, receive socket 0
        SUCCESS
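
The selection shown in the output above can be modelled in a few lines of user-space Python. This is only a sketch of the mapping, not the BPF program from the test: it assumes the attached program returns the current NUMA node id modulo the number of sockets in the reuseport group, and the names `parse_node_cpus` and `select_socket` are hypothetical helpers introduced for illustration.

```python
def parse_node_cpus(numactl_output):
    """Map each CPU id to its NUMA node, from `numactl -H` style text."""
    cpu_to_node = {}
    for line in numactl_output.splitlines():
        parts = line.split()
        # Only lines of the form "node <N> cpus: <cpu> <cpu> ..." are used.
        if len(parts) >= 3 and parts[0] == "node" and parts[2] == "cpus:":
            node = int(parts[1])
            for cpu in parts[3:]:
                cpu_to_node[int(cpu)] = node
    return cpu_to_node


def select_socket(node_id, num_sockets):
    """Assumed selection rule: socket index = node id modulo group size."""
    return node_id % num_sockets


# CPU lists copied from the numactl output quoted in the commit message.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
"""

cpu_to_node = parse_node_cpus(SAMPLE)

# Pretend the sender runs on these CPUs in turn (arbitrary choice).
for cpu in (0, 6, 18, 12):
    node = cpu_to_node[cpu]
    print(f"send node {node}, receive socket {select_socket(node, 2)}")
```

With two nodes and two sockets the mapping is the identity, which matches the "send node N, receive socket N" pattern in the test output.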
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: add helper for retrieving current numa node id · 2d0e30c3
      Daniel Borkmann authored
      The main use case is SO_REUSEPORT selecting sockets for the local
      NUMA node, but since the helper is generic, let's also add it for
      other networking and tracing program types.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'udpmem' · a10b91b8
      David S. Miller authored
      Paolo Abeni says:
      
      ====================
      udp: refactor memory accounting
      
      This patch series refactors the udp memory accounting, replacing the
      generic implementation with a custom one, in order to remove the need to
      lock the socket on the enqueue and dequeue operations. The socket backlog
      usage is dropped as well.
      
      The first patch factors out pieces of some queue and memory management
      socket helpers, so that they can later be used by the udp memory accounting
      functions.
      The second patch adds the memory accounting helpers, without using them.
      The third patch replaces the old rx memory accounting path for udp over ipv4 and
      udp over ipv6. In-kernel UDP users are updated as well.
      
      The memory accounting schema is described in detail in the commit messages
      of the individual patches.

      The performance gain depends on the specific scenario: with few flows (and
      little contention in the original code) the differences are in the noise range,
      while with several flows contending for the same socket, the measured speed-up
      is significant (e.g. even over 100% in case of extreme contention).
      
      Many thanks to Eric Dumazet for the repeated reviews and suggestions.
      
      v5 -> v6:
       - do not orphan the skb on enqueue, skb_steal_sock() already did
         the work for us
      
      v4 -> v5:
       - use the receive queue spin lock to protect the memory accounting
       - several minor clean-ups
      
      v3 -> v4:
       - simplified the locking schema, always use a plain spinlock
      
      v2 -> v3:
       - do not set the now unused backlog_rcv callback
      
      v1 -> v2:
       - slightly changed the memory accounting schema, we now perform lazy reclaim
       - fixed a forward_alloc updating issue
       - fixed memory counter integer overflows
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: use it's own memory accounting schema · 850cbadd
      Paolo Abeni authored
      Completely avoid default sock memory accounting and replace it
      with udp-specific accounting.
      
      Since the new memory accounting model completely encapsulates
      the required locking, remove the socket lock on both enqueue and
      dequeue, and avoid using the backlog on enqueue.
      
      Be sure to clean up rx queue memory on socket destruction, using
      UDP's own sk_destruct.
      
      Tested using pktgen with random src port, 64-byte packets,
      wire-speed on a 10G link as sender and udp_sink as the receiver,
      using an l4 tuple rxhash to stress the contention, and one or more
      udp_sink instances with reuseport.
      
      nr readers      Kpps (vanilla)  Kpps (patched)
      1               170             440
      3               1250            2150
      6               3000            3650
      9               4200            4450
      12              5700            6250
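
The relative gain for each row of the table above can be worked out with a short script. This is plain arithmetic on the figures quoted in the commit message; the helper name `gain_percent` is introduced here for illustration only.

```python
# Kpps figures copied from the table above: (readers, vanilla, patched)
RESULTS = [
    (1, 170, 440),
    (3, 1250, 2150),
    (6, 3000, 3650),
    (9, 4200, 4450),
    (12, 5700, 6250),
]


def gain_percent(vanilla, patched):
    """Relative throughput gain of patched over vanilla, in percent."""
    return (patched - vanilla) / vanilla * 100.0


for readers, vanilla, patched in RESULTS:
    print(f"{readers:>2} readers: {vanilla:>5} -> {patched:>5} Kpps "
          f"({gain_percent(vanilla, patched):+.1f}%)")
```

The single-reader row is the only one where the gain exceeds 100%; at 9 and 12 readers the gain is under 10%.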
      
      v4 -> v5:
        - avoid unneeded test in first_packet_length
      
      v3 -> v4:
        - remove useless sk_rcvqueues_full() call
      
      v2 -> v3:
        - do not set the now unused backlog_rcv callback
      
      v1 -> v2:
        - add memory pressure support
        - fixed dropwatch accounting for ipv6
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: implement memory accounting helpers · f970bd9e
      Paolo Abeni authored
      Avoid using the generic helpers.
      Use the receive queue spin lock to protect the memory
      accounting operation, both on enqueue and on dequeue.
      
      On dequeue perform partial memory reclaiming, trying to
      leave a quantum of forward allocated memory.
      
      On enqueue use a custom helper, to allow some optimizations:
      - use a plain spin_lock() variant instead of the slightly
        more costly spin_lock_irqsave()
      - avoid the dst_force check, since the calling code has already
        dropped the skb dst
      - avoid orphaning the skb, since skb_steal_sock() already did
        the work for us
      
      The above needs custom memory reclaiming on shutdown, provided
      by udp_destruct_sock().
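
The accounting described above can be pictured with a small user-space model. This is a simplified sketch, not the kernel code: the class `UdpMemModel`, its field names, and the 4096-byte `QUANTUM` are assumptions made for illustration, and a `threading.Lock` stands in for the receive queue spin lock.

```python
import threading

QUANTUM = 4096  # assumed accounting granularity, for illustration only


class UdpMemModel:
    """Toy model of per-socket receive memory accounting."""

    def __init__(self, rcvbuf):
        self.rcvbuf = rcvbuf
        self.rmem_alloc = 0      # bytes currently sitting in the queue
        self.forward_alloc = 0   # bytes reserved but not used by any packet
        self.charged = 0         # bytes charged to the global pool
        self.queue = []
        self.lock = threading.Lock()  # stands in for the rx queue spin lock

    def enqueue(self, size):
        with self.lock:
            # Checking before adding lets one packet through even when it
            # pushes rmem_alloc past rcvbuf.
            if self.rmem_alloc > self.rcvbuf:
                return False
            if size > self.forward_alloc:
                # Reserve memory in whole quanta.
                need = size - self.forward_alloc
                amt = -(-need // QUANTUM) * QUANTUM
                self.forward_alloc += amt
                self.charged += amt
            self.forward_alloc -= size
            self.rmem_alloc += size
            self.queue.append(size)
            return True

    def dequeue(self):
        with self.lock:
            if not self.queue:
                return None
            size = self.queue.pop(0)
            self.rmem_alloc -= size
            self.forward_alloc += size
            # Partial reclaim: return whole quanta, keep at most one quantum.
            amt = max(self.forward_alloc - 1, 0) // QUANTUM * QUANTUM
            self.forward_alloc -= amt
            self.charged -= amt
            return size

    def destruct(self):
        # Full reclaim on socket destruction: drop the queue, uncharge all.
        with self.lock:
            self.queue.clear()
            self.rmem_alloc = 0
            self.forward_alloc = 0
            self.charged = 0


sk = UdpMemModel(rcvbuf=1 << 20)
for _ in range(3):
    sk.enqueue(3000)
while sk.dequeue() is not None:
    pass
print("left reserved after draining:", sk.forward_alloc)
sk.destruct()
print("charged after destruct:", sk.charged)
```

In this model `charged` always equals `rmem_alloc + forward_alloc`, and draining the queue leaves some reserved memory behind, which is why a separate full reclaim is needed when the socket goes away.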
      
      v5 -> v6:
        - don't orphan the skb on enqueue
      
      v4 -> v5:
        - replace the mem_lock with the receive queue spin lock
        - ensure that the bh is always allowed to enqueue at least
          a skb, even if sk_rcvbuf is exceeded
      
      v3 -> v4:
        - reworked memory accounting, simplifying the schema
        - provide a helper for both memory scheduling and enqueuing
      
      v1 -> v2:
        - use a udp specific destructor to perform memory reclaiming
        - remove a couple of helpers, unneeded after the above cleanup
        - do not reclaim memory on dequeue if not under memory
          pressure
        - reworked the fwd accounting schema to avoid potential
          integer overflow
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/socket: factor out helpers for memory and queue manipulation · f8c3bf00
      Paolo Abeni authored
      Factor out basic sock operations that the udp code can use with its
      own memory accounting schema. No functional change is introduced
      in the existing APIs.
      
      v4 -> v5:
        - avoid whitespace changes
      
      v2 -> v4:
        - avoid exporting __sock_enqueue_skb
      
      v1 -> v2:
        - avoid exporting sock_rmem_free
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 21 Oct, 2016 1 commit