1. 29 Mar, 2015 13 commits
    • hv_netvsc: Implement batching in send buffer · 7c3877f2
      Haiyang Zhang authored
      With this patch, we can send out multiple RNDIS data packets in one send buffer
      slot and one VMBus message. It reduces the overhead associated with VMBus messages.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
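      The batching idea described in this commit can be sketched in user-space C.
      This is a minimal illustration and not the hv_netvsc code: SLOT_SIZE,
      struct send_slot, slot_append() and slot_flush() are hypothetical names, and
      the msgs counter merely stands in for one VMBus message per flush.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 6144  /* assumed slot size, for illustration only */

/* Hypothetical send buffer slot that accumulates several packets. */
struct send_slot {
    unsigned char data[SLOT_SIZE];
    size_t used;        /* bytes already filled */
    unsigned int pkts;  /* packets batched so far */
    unsigned int msgs;  /* number of messages sent (flushes) */
};

/* Send whatever is batched as a single message and reset the slot. */
static void slot_flush(struct send_slot *s)
{
    if (s->pkts == 0)
        return;
    s->msgs++;          /* one message covers all batched packets */
    s->used = 0;
    s->pkts = 0;
}

/*
 * Append a packet to the slot, flushing first if it does not fit in
 * the remaining space. Returns 0 on success, or -1 if the packet is
 * larger than a whole slot.
 */
static int slot_append(struct send_slot *s, const void *pkt, size_t len)
{
    if (len > SLOT_SIZE)
        return -1;
    if (s->used + len > SLOT_SIZE)
        slot_flush(s);
    memcpy(s->data + s->used, pkt, len);
    s->used += len;
    s->pkts++;
    return 0;
}
```

      Appending three 1500-byte packets and then flushing once produces a single
      message in this model, instead of three.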
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 4ef295e0
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      The following patchset contains Netfilter updates for your net-next tree.
      Basically, nf_tables updates to add the set extension infrastructure and finish
      the transaction for sets from Patrick McHardy. More specifically, they are:
      
      1) Move netns to basechain and use recently added possible_net_t, from
         Patrick McHardy.
      
      2) Use LOGLEVEL_<FOO> from nf_log infrastructure, from Joe Perches.
      
      3) Restore nf_log_trace that was accidentally removed during conflict
         resolution.
      
      4) nft_queue does not depend on NETFILTER_XTABLES, starting from here
         all patches from Patrick McHardy.
      
      5) Use raw_smp_processor_id() in nft_meta.
      
      Then, several patches to prepare ground for the new set extension
      infrastructure:
      
      6) Pass object length to the hash callback in rhashtable as needed by
         the new set extension infrastructure.
      
      7) Cleanup patch to restore struct nft_hash as wrapper for struct
         rhashtable
      
      8) Another small source code readability cleanup for nft_hash.
      
      9) Convert nft_hash to rhashtable callbacks.
      
      And finally...
      
      10) Add the new set extension infrastructure.
      
      11) Convert the nft_hash and nft_rbtree sets to use it.
      
      12) Batch set element release to avoid several RCU grace periods in a row
          and add new function nft_set_elem_destroy() to consolidate set element
          release.
      
      13) Return the set extension data area from nft_lookup.
      
      14) Refactor existing transaction code to add some helper functions
          and document it.
      
      15) Complete the set transaction support, using similar approach to what we
          already use, to activate/deactivate elements in an atomic fashion.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tipc-next' · ae7633c8
      David S. Miller authored
      Ying Xue says:
      
      ====================
      tipc: fix two corner issues
      
      The patch set aims at resolving the following two critical issues:
      
      Patch #1: Resolve a deadlock which happens while all links are reset
      Patch #2: Correct a mistaken use of the RCU lock that protects the
                node list
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: involve reference counter for node structure · 8a0f6ebe
      Ying Xue authored
      The TIPC node hash table is protected by the RCU lock on the read side.
      tipc_node_find() looks up a node object by node address by iterating
      over the hash table. Because the whole traversal in tipc_node_find()
      is guarded by the RCU read lock, the lookup itself is safe. However,
      when callers use the node object returned by tipc_node_find(), no RCU
      read lock is held, so the object may be freed while it is still in use.
      This is unsafe for callers of tipc_node_find().
      
      Now we introduce a reference counter for the node structure. Before
      tipc_node_find() returns a node object to its caller, it first increases
      the reference counter. Accordingly, once the caller is done with the
      object, it decreases the counter again. This prevents a node being used
      by one thread from being freed by another thread.
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
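      The get/put discipline described in this commit can be sketched in
      user-space C with C11 atomics. This is a simplified model, not the TIPC
      code: struct node, node_create(), node_get(), node_put() and node_find()
      are hypothetical names, the "table" is a single entry, and no RCU is
      involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical node object; a real node structure has many more fields. */
struct node {
    unsigned int addr;
    atomic_int refcnt;
};

/* Create a node holding one reference on behalf of the lookup table. */
static struct node *node_create(unsigned int addr)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return NULL;
    n->addr = addr;
    atomic_init(&n->refcnt, 1);
    return n;
}

/* Take an extra reference. */
static void node_get(struct node *n)
{
    atomic_fetch_add(&n->refcnt, 1);
}

/* Drop a reference; frees the node and returns 1 when it was the last one. */
static int node_put(struct node *n)
{
    if (atomic_fetch_sub(&n->refcnt, 1) == 1) {
        free(n);
        return 1;
    }
    return 0;
}

/*
 * Lookup stand-in: the reference is taken before the pointer is handed
 * back, so the caller may keep using the node until it calls node_put().
 */
static struct node *node_find(struct node *entry, unsigned int addr)
{
    if (!entry || entry->addr != addr)
        return NULL;
    node_get(entry);
    return entry;
}
```

      In this model the table can drop its own reference while a caller still
      holds one; the node is only freed by the final node_put().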
    • tipc: fix potential deadlock when all links are reset · b952b2be
      Ying Xue authored
      [   60.988363] ======================================================
      [   60.988754] [ INFO: possible circular locking dependency detected ]
      [   60.989152] 3.19.0+ #194 Not tainted
      [   60.989377] -------------------------------------------------------
      [   60.989781] swapper/3/0 is trying to acquire lock:
      [   60.990079]  (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc]
      [   60.990743]
      [   60.990743] but task is already holding lock:
      [   60.991106]  (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc]
      [   60.991738]
      [   60.991738] which lock already depends on the new lock.
      [   60.991738]
      [   60.992174]
      [   60.992174] the existing dependency chain (in reverse order) is:
      [   60.992174]
      -> #1 (&(&bclink->lock)->rlock){+.-...}:
      [   60.992174]        [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140
      [   60.992174]        [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50
      [   60.992174]        [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc]
      [   60.992174]        [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc]
      [   60.992174]        [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc]
      [   60.992174]        [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc]
      [   60.992174]        [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc]
      [   60.992174]        [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc]
      [   60.992174]        [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc]
      [   60.992174]        [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980
      [   60.992174]        [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70
      [   60.992174]        [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130
      [   60.992174]        [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0
      [   60.992174]        [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490
      [   60.992174]        [<ffffffff8155c1b7>] e1000_clean+0x267/0x990
      [   60.992174]        [<ffffffff81647b60>] net_rx_action+0x150/0x360
      [   60.992174]        [<ffffffff8105ec43>] __do_softirq+0x123/0x360
      [   60.992174]        [<ffffffff8105f12e>] irq_exit+0x8e/0xb0
      [   60.992174]        [<ffffffff8179f9f5>] do_IRQ+0x65/0x110
      [   60.992174]        [<ffffffff8179da6f>] ret_from_intr+0x0/0x13
      [   60.992174]        [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20
      [   60.992174]        [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0
      [   60.992174]        [<ffffffff81033cda>] start_secondary+0x13a/0x150
      [   60.992174]
      -> #0 (&(&n_ptr->lock)->rlock){+.-...}:
      [   60.992174]        [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0
      [   60.992174]        [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140
      [   60.992174]        [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50
      [   60.992174]        [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc]
      [   60.992174]        [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc]
      [   60.992174]        [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc]
      [   60.992174]        [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc]
      [   60.992174]        [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980
      [   60.992174]        [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70
      [   60.992174]        [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130
      [   60.992174]        [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0
      [   60.992174]        [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490
      [   60.992174]        [<ffffffff8155c1b7>] e1000_clean+0x267/0x990
      [   60.992174]        [<ffffffff81647b60>] net_rx_action+0x150/0x360
      [   60.992174]        [<ffffffff8105ec43>] __do_softirq+0x123/0x360
      [   60.992174]        [<ffffffff8105f12e>] irq_exit+0x8e/0xb0
      [   60.992174]        [<ffffffff8179f9f5>] do_IRQ+0x65/0x110
      [   60.992174]        [<ffffffff8179da6f>] ret_from_intr+0x0/0x13
      [   60.992174]        [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20
      [   60.992174]        [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0
      [   60.992174]        [<ffffffff81033cda>] start_secondary+0x13a/0x150
      [   60.992174]
      [   60.992174] other info that might help us debug this:
      [   60.992174]
      [   60.992174]  Possible unsafe locking scenario:
      [   60.992174]
      [   60.992174]        CPU0                    CPU1
      [   60.992174]        ----                    ----
      [   60.992174]   lock(&(&bclink->lock)->rlock);
      [   60.992174]                                lock(&(&n_ptr->lock)->rlock);
      [   60.992174]                                lock(&(&bclink->lock)->rlock);
      [   60.992174]   lock(&(&n_ptr->lock)->rlock);
      [   60.992174]
      [   60.992174]  *** DEADLOCK ***
      [   60.992174]
      [   60.992174] 3 locks held by swapper/3/0:
      [   60.992174]  #0:  (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980
      [   60.992174]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc]
      [   60.992174]  #2:  (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc]
      [   60.992174]
      
      The correct sequence for grabbing n_ptr->lock and bclink->lock is
      that the former is held first and the latter is taken afterwards,
      which is exactly what happens on CPU1. But when retransmission on
      the broadcast link fails, bclink->lock is first held in
      tipc_bclink_rcv(), and n_ptr->lock is subsequently taken in
      link_retransmit_failure(), called by tipc_link_retransmit(), as
      demonstrated on CPU0. As a result, a deadlock can occur.
      
      If the order in which the two locks are taken on CPU0 is reversed, the
      deadlock risk is eliminated. Therefore, the node lock originally taken
      in link_retransmit_failure() is moved to tipc_bclink_rcv() so that it
      is obtained before the bclink lock. The precondition for this change
      is that the response to the bclink reset event must be moved from
      tipc_bclink_unlock() to tipc_node_unlock().
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
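      The lock-ordering rule behind this fix can be sketched with POSIX mutexes.
      This is only an illustration of the general technique (every path takes
      the node lock before the broadcast-link lock), not the TIPC code; the
      function and variable names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t node_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t bclink_lock = PTHREAD_MUTEX_INITIALIZER;

static int link_up_events;
static int retransmit_failures;

/* Path corresponding to CPU1 in the report: node lock, then bclink lock. */
static void on_link_up(void)
{
    pthread_mutex_lock(&node_lock);
    pthread_mutex_lock(&bclink_lock);
    link_up_events++;
    pthread_mutex_unlock(&bclink_lock);
    pthread_mutex_unlock(&node_lock);
}

/*
 * Path corresponding to CPU0 after the fix: the node lock is taken
 * up front, before the bclink lock, so both paths agree on the order
 * and no circular dependency can form.
 */
static void on_bcast_retransmit_failure(void)
{
    pthread_mutex_lock(&node_lock);
    pthread_mutex_lock(&bclink_lock);
    retransmit_failures++;
    pthread_mutex_unlock(&bclink_lock);
    pthread_mutex_unlock(&node_lock);
}
```

      Had the second function taken bclink_lock first, two threads running the
      two paths concurrently could each hold one lock while waiting for the
      other, which is the scenario lockdep reports above.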
    • virtio: simplify the using of received in virtnet_poll · faadb05f
      Li RongQing authored
      received is 0 at this point, so there is no need to subtract it or to use "+=" when assigning it.
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
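      The equivalence this commit relies on can be shown with a small sketch.
      It assumes the original statement had the shape
      "received += receive(budget - received)" with received initialised to 0;
      that shape, and the helper fake_receive(), are assumptions made for
      illustration and are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical receive helper: handles at most 5 packets per call. */
static int fake_receive(int budget)
{
    return budget < 5 ? budget : 5;
}

/* Assumed shape before the change: received is known to be 0 here. */
static int poll_before(int budget)
{
    int received = 0;

    received += fake_receive(budget - received);
    return received;
}

/* Simplified form: plain assignment, no subtraction. */
static int poll_after(int budget)
{
    int received;

    received = fake_receive(budget);
    return received;
}
```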
    • Merge branch 'be2net-next' · 3556eaaa
      David S. Miller authored
      Sathya Perla says:
      
      ====================
      be2net: patch set
      
      Hi David, this patch set includes 2 feature additions to the be2net driver:
      
      Patch 1 sets up cpu affinity hints for be2net irqs using the
      cpumask_set_cpu_local_first() API that first picks the near numa cores
      and when they are exhausted, selects the far numa cores.
      
      Patch 2 sets up xps queue mapping for be2net's TXQs to avoid,
      by default, TX lock contention.
      
      Patch 3 just bumps up the driver version.
      
      Pls consider applying this patch set to the net-next queue. Thanks!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: setup xps queue mapping · 73f394e6
      Sathya Perla authored
      This patch sets up xps queue mapping on load, so that TX traffic is
      steered to the queue whose irqs are being processed by the current cpu.
      This helps in avoiding TX lock contention.
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: assign CPU affinity hints to be2net IRQs · d658d98a
      Padmanabh Ratnakar authored
      This patch provides hints to irqbalance to map be2net IRQs to
      specific CPU cores. cpumask_set_cpu_local_first() is used, which first
      maps IRQs to near NUMA cores; when those cores are exhausted, IRQs are
      mapped to far NUMA cores.
      Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
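      The "near NUMA cores first, then far ones" policy can be sketched as a
      plain C function. This is not the kernel's cpumask_set_cpu_local_first()
      implementation; pick_cpu_local_first() and the cpu_node[] array (NUMA node
      of each CPU) are hypothetical and only model the selection order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the CPU for the i-th IRQ. CPUs on the device's NUMA node are
 * handed out first; once they are exhausted, CPUs on other nodes are
 * used. cpu_node[c] is the NUMA node of CPU c. Returns -1 if there
 * are no CPUs.
 */
static int pick_cpu_local_first(const int *cpu_node, size_t ncpus,
                                int local_node, size_t i)
{
    size_t c, seen = 0;

    if (ncpus == 0)
        return -1;
    i %= ncpus;  /* wrap around when there are more IRQs than CPUs */

    /* First pass: CPUs on the local node. */
    for (c = 0; c < ncpus; c++)
        if (cpu_node[c] == local_node && seen++ == i)
            return (int)c;

    /* Second pass: the remaining, far CPUs. */
    for (c = 0; c < ncpus; c++)
        if (cpu_node[c] != local_node && seen++ == i)
            return (int)c;

    return -1;  /* not reached when ncpus > 0 */
}
```

      With four CPUs where CPUs 2 and 3 sit on the device's node, IRQs 0 and 1
      land on CPUs 2 and 3, and IRQs 2 and 3 fall back to CPUs 0 and 1.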
    • tcp: tcp_syn_flood_action() can be static · 41d25fe0
      Eric Dumazet authored
      After commit 1fb6f159 ("tcp: add tcp_conn_request"),
      tcp_syn_flood_action() is no longer used from IPv6.
      
      We can make it static, by moving it above tcp_conn_request()
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Octavian Purdila <octavian.purdila@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: fix boolreturn.cocci warnings · 1fb7cd4e
      Wu Fengguang authored
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_fcoe.c:49:9-10: WARNING: return of 0/1 in function 'cxgb_fcoe_sof_eof_supported' with return type bool
      
       Return statements in functions returning bool should use
       true/false instead of 1/0.
      Generated by: scripts/coccinelle/misc/boolreturn.cocci
      
      CC: Varun Prakash <varun@chelsio.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
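      The kind of change this warning asks for looks roughly like the sketch
      below. The function delimiters_supported() and the delimiter values are
      made up for illustration; they are not the cxgb4 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate; the accepted values are illustrative only. */
static bool delimiters_supported(unsigned int sof, unsigned int eof)
{
    if (sof != 0x2e && sof != 0x36)
        return false;   /* rather than "return 0" */
    if (eof != 0x41 && eof != 0x42)
        return false;
    return true;        /* rather than "return 1" */
}
```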
    • fib6: install fib6 ops in the last step · 85b99092
      WANG Cong authored
      We should not commit the new ops until we finish
      all the setup, otherwise we have to NULL it on failure.
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
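      The "publish last" pattern this commit describes can be sketched as
      follows. The types and names (struct ops, struct netns_model, ns_init())
      are hypothetical; the point is only that the shared pointer is assigned
      after every step that can fail, so the error path never has to reset it.

```c
#include <assert.h>
#include <stdlib.h>

struct ops {
    int table_size;
    int *table;
};

/* Hypothetical per-namespace state with a published ops pointer. */
struct netns_model {
    struct ops *ops;
};

static int ns_init(struct netns_model *ns, int table_size)
{
    struct ops *ops;

    if (table_size <= 0)
        return -1;

    ops = malloc(sizeof(*ops));
    if (!ops)
        return -1;

    ops->table_size = table_size;
    ops->table = calloc((size_t)table_size, sizeof(*ops->table));
    if (!ops->table) {
        free(ops);      /* ns->ops was never set, nothing to undo there */
        return -1;
    }

    ns->ops = ops;      /* commit only after all setup has succeeded */
    return 0;
}
```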
  2. 27 Mar, 2015 12 commits
  3. 26 Mar, 2015 4 commits
    • netfilter: nf_tables: implement set transaction support · cc02e457
      Patrick McHardy authored
      Set elements are the last object type not supporting transactions.
      Implement this similarly to the existing rule transactions:
      
      The global transaction counter keeps track of two generations, current
      and next. Each element contains a bitmask specifying in which generations
      it is inactive.
      
      New elements start out as inactive in the current generation and active
      in the next. On commit, the previous next generation becomes the current
      generation and the element becomes active. The bitmask is then cleared
      to indicate that the element is active in all future generations. If the
      transaction is aborted, the element is removed from the set before it
      becomes active.
      
      When removing an element, it gets marked as inactive in the next generation.
      On commit, the next generation becomes active and therefore the element
      becomes inactive. It is then taken out of the set and released. On abort,
      the element is marked as active for the next generation again.
      
      Lookups ignore elements not active in the current generation.
      
      The current set types (hash/rbtree) both use a field in the extension area
      to store the generation mask. This (currently) does not require any
      additional memory since we have some free space in there.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
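      The two-generation scheme described in this commit can be modelled in a
      few lines of C. This is a simplified sketch under stated assumptions, not
      the nf_tables code: the names (gencursor, struct elem, struct trans,
      commit(), abort_trans()) are hypothetical, and the in_set flag stands in
      for membership in a real set data structure.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int gencursor;  /* current generation: 0 or 1 */

static unsigned int gen_cur_bit(void)  { return 1u << gencursor; }
static unsigned int gen_next_bit(void) { return 1u << (gencursor ^ 1u); }

struct elem {
    unsigned int genmask;  /* a set bit means "inactive in that generation" */
    bool in_set;
};

enum op { OP_ADD, OP_DEL };
struct trans { enum op op; struct elem *e; };

/* Lookups ignore elements that are not active in the current generation. */
static bool elem_active(const struct elem *e)
{
    return e->in_set && !(e->genmask & gen_cur_bit());
}

/* New element: inactive in the current generation, active in the next. */
static void elem_add(struct elem *e)
{
    e->in_set = true;
    e->genmask = gen_cur_bit();
}

/* Removal: mark the element inactive in the next generation. */
static void elem_del(struct elem *e)
{
    e->genmask |= gen_next_bit();
}

static void commit(struct trans *t, int n)
{
    gencursor ^= 1u;                  /* next generation becomes current */
    for (int i = 0; i < n; i++) {
        if (t[i].op == OP_ADD)
            t[i].e->genmask = 0;      /* active in all future generations */
        else
            t[i].e->in_set = false;   /* take it out of the set */
    }
}

static void abort_trans(struct trans *t, int n)
{
    for (int i = 0; i < n; i++) {
        if (t[i].op == OP_ADD)
            t[i].e->in_set = false;   /* never became active */
        else
            t[i].e->genmask &= ~gen_next_bit();  /* active again */
    }
}
```

      In this model an added element only becomes visible to elem_active()
      after commit(), and a deleted element stays visible until commit() or is
      fully restored by abort_trans().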
    • netfilter: nf_tables: add transaction helper functions · ea4bd995
      Patrick McHardy authored
      Add some helper functions for building the genmask as preparation for
      set transactions.
      
      Also add a little documentation on how this stuff actually works.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: return set extensions from ->lookup() · b2832dd6
      Patrick McHardy authored
      Return the extension area from the ->lookup() function to allow
      consolidating common actions.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: consolide set element destruction · 61edafbb
      Patrick McHardy authored
      With the conversion to set extensions, it is now possible to consolidate
      the different set element destruction functions.
      
      The set implementations' ->remove() functions are changed to only take
      the element out of their internal data structures. Elements will be freed
      in a batched fashion after the global transaction's completion RCU grace
      period.
      
      This reduces the number of grace periods required for nft_hash from N
      to zero additional ones. Additionally, this guarantees that the set
      elements' extensions of all implementations can be used under RCU
      protection.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
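      The batching described in this commit can be modelled in user space. This
      is a rough sketch, not the nf_tables code: wait_grace_period() is a
      counter standing in for an RCU grace period, and elem_defer_destroy() and
      release_batch() are hypothetical names. In the commit the wait is the one
      the transaction already performs; here it is modelled as a single shared
      wait for the whole batch.

```c
#include <assert.h>
#include <stdlib.h>

struct elem {
    int key;
    struct elem *next;
};

static struct elem *pending;  /* unlinked elements awaiting release */
static int grace_periods;     /* how many times we had to wait */

/* Stand-in for waiting until no reader can still see old elements. */
static void wait_grace_period(void)
{
    grace_periods++;
}

/* Removal step: the element is already unlinked; only queue it. */
static void elem_defer_destroy(struct elem *e)
{
    e->next = pending;
    pending = e;
}

/* Free all queued elements after one shared wait; returns the count. */
static int release_batch(void)
{
    int n = 0;

    if (!pending)
        return 0;
    wait_grace_period();
    while (pending) {
        struct elem *e = pending;

        pending = e->next;
        free(e);
        n++;
    }
    return n;
}
```

      Queuing three elements and releasing them together costs one wait in this
      model, where freeing each element separately would cost three.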
  4. 25 Mar, 2015 11 commits