1. 21 Oct, 2010 21 commits
  2. 20 Oct, 2010 8 commits
    • net: avoid RCU for NOCACHE dst · 27b75c95
      Eric Dumazet authored
      There is no point using RCU for dst we allocate for a very short time
      (used once).
      
      Change dst_release() to take DST_NOCACHE into account, but also change
      skb_dst_set_noref() to force a refcount increment for such dst.
      
      This is a _huge_ gain, because we don't waste memory storing thousands
      of dsts. Instead of queueing them to RCU, we can free them instantly.
      
      CPU caches can stay hot, re-using same memory blocks to hold temporary
      dsts.
      
      Note: this also removes the unneeded smp_mb__before_atomic_dec() in
      dst_release(), since atomic_dec_return() implies a full memory barrier.
      
      Stress test: 160,000,000 UDP frames sent, IP route cache disabled
      (DDoS).
      
      Before:
      
      real    0m38.091s
      user    0m13.189s
      sys     7m53.018s
      
      After:
      
      real	0m29.946s
      user	0m12.157s
      sys	7m40.605s
      
      For reference, with the IP route cache enabled:
      
      real	0m32.030s
      user	0m10.521s
      sys	8m15.243s
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
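The refcounting rule described in the commit above can be sketched in plain userspace C. This is only a model under stated assumptions: `struct dst_model`, `attach_dst()`, `dst_model_release()` and the flag value are hypothetical names invented for illustration, a plain `int` stands in for the kernel's atomic refcount, and the RCU-deferred path for cached entries is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define DST_NOCACHE_MODEL 0x1u  /* hypothetical flag value, illustration only */

struct dst_model {
    int refcnt;          /* plain int standing in for an atomic refcount */
    unsigned int flags;
    bool freed;          /* set when the entry is released immediately */
};

/* Attach a dst to a packet. An uncached (NOCACHE) dst always gets a real
 * reference; a cached one is only borrowed (RCU-protected in the kernel).
 * Returns true when the packet now owns a reference. */
static bool attach_dst(struct dst_model *dst)
{
    if (dst->flags & DST_NOCACHE_MODEL) {
        dst->refcnt++;
        return true;
    }
    return false;
}

/* Drop one reference. An uncached entry is freed as soon as the last
 * reference goes away; a cached entry is left for the deferred path. */
static void dst_model_release(struct dst_model *dst)
{
    int newref = --dst->refcnt;

    assert(newref >= 0);
    if (newref == 0 && (dst->flags & DST_NOCACHE_MODEL))
        dst->freed = true;
}
```

In this model a temporary dst created with one reference and attached to one packet is freed on the second release, which is the "free them instantly" behavior the message describes.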
    • net: allocate tx queues in register_netdevice · e6484930
      Tom Herbert authored
      This patch introduces netif_alloc_netdev_queues, which is called from
      register_netdevice instead of alloc_netdev_mq.  This makes TX queue
      allocation symmetric with RX allocation.  Also, queue lock allocation
      is now done in netdev_init_one_queue.  Change set_real_num_tx_queues to
      fail if the requested number is < 1 or greater than the number of
      allocated queues.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cleanups in RX queue allocation · bd25fa7b
      Tom Herbert authored
      Clean up RX queue allocation.  netif_set_real_num_rx_queues now
      returns an error on an attempt to set zero queues, or when the
      requested number is greater than the number of allocated queues.
      netif_alloc_rx_queues now does BUG_ON if queue_count is zero.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
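The range check described in the two queue commits above might look roughly like the following userspace sketch. The struct, the function names and the choice of `-EINVAL` as the error value are assumptions made for illustration (the commit messages only say "return error"), and `assert()` stands in for the kernel's BUG_ON.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the relevant net_device fields. */
struct netdev_model {
    unsigned int num_rx_queues;       /* allocated when the device is set up */
    unsigned int real_num_rx_queues;  /* number currently in use */
};

/* Allocation step: a zero queue count is treated as a programming error. */
static void alloc_rx_queues_model(struct netdev_model *dev, unsigned int count)
{
    assert(count > 0);  /* models the BUG_ON mentioned in the commit */
    dev->num_rx_queues = count;
    dev->real_num_rx_queues = count;
}

/* Reject zero queues and anything beyond what was allocated;
 * on error the current setting is left untouched. */
static int set_real_num_rx_queues_model(struct netdev_model *dev,
                                        unsigned int rxq)
{
    if (rxq < 1 || rxq > dev->num_rx_queues)
        return -EINVAL;  /* assumed error code */
    dev->real_num_rx_queues = rxq;
    return 0;
}
```

The same bounds check applies symmetrically to the TX side described in the preceding commit.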
    • net: fail alloc_netdev_mq if queue count < 1 · 55513fb4
      Tom Herbert authored
      In alloc_netdev_mq, fail if the requested queue_count is < 1.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • phonet: remove the unused variable pn · c5e90f56
      Changli Gao authored
      Signed-off-by: Changli Gao <xiaosuo@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: Revert napi_poll fix for bonding driver · f13d493d
      Neil Horman authored
      In an earlier patch I modified napi_poll so that devices with IFF_MASTER polled
      the per_cpu list instead of the device list for napi.  I did this because the
      bonding driver has no napi instances to poll; it instead expects to check the
      slave devices' napi instances, which napi_poll was unaware of.  Looking at this
      more closely, however, I now see this isn't strictly needed, as the bond driver's
      poll_controller calls the slaves' poll_controller via netpoll_poll_dev, which
      recursively calls poll_napi on each slave, allowing those napi instances to get
      serviced.  The earlier patch isn't at all harmful, it's just not needed, so let's
      revert it to make the code cleaner.  Sorry for the noise.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: WANG Cong <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: Remove netpoll blocking from uninit path · 9ff76c95
      Neil Horman authored
      Some recent testing of netpoll with bonding showed this backtrace:
      
       ------------[ cut here ]------------
       kernel BUG at drivers/net/bonding/bonding.h:134!
       invalid opcode: 0000 [#1] SMP
       last sysfs file: /sys/devices/pci0000:00/0000:00:1d.2/usb7/devnum
       CPU 0
       Pid: 1876, comm: rmmod Not tainted 2.6.36-rc3+ #10 D26928/
       RIP: 0010:[<ffffffffa0514ba4>]  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0
       RSP: 0018:ffff88003b1b5d58  EFLAGS: 00010296
       RAX: ffff88003b9b6200 RBX: ffff8800373e8e00 RCX: 00000000000f4240
       RDX: 00000000ffffffff RSI: 0000000000000286 RDI: 0000000000000286
       RBP: ffff88003b1b5dc8 R08: 0000000000000000 R09: 00000001af7de920
       R10: 0000000000000000 R11: ffff880002495e98 R12: ffff880037922700
       R13: ffff880038c31000 R14: ffff880037922730 R15: 0000000000000286
       FS:  00007f90e6d72700(0000) GS:ffff880002400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 000000346f0d9ad0 CR3: 000000003b263000 CR4: 00000000000006f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process rmmod (pid: 1876, threadinfo ffff88003b1b4000, task ffff88003b36aa80)
       Stack:
       00000000ffffffff ffff88003b1b5d7a ffff8800379221e8 ffff880037922000
       <0> ffff88003b1b5dc8 ffffffff813eb5fb ffff88003b1b5da8 0000000031b177a3
       <0> ffff88003b1b5da8 ffff880037922000 ffff88003b1b5e48 ffff88003b1b5e48
       Call Trace:
       [<ffffffff813eb5fb>] ? rtmsg_ifinfo+0xcb/0xf0
       [<ffffffff813daad8>] rollback_registered_many+0x168/0x280
       [<ffffffff813dac09>] unregister_netdevice_many+0x19/0x80
       [<ffffffff813e97b3>] __rtnl_kill_links+0x63/0x90
       [<ffffffff813e980b>] __rtnl_link_unregister+0x2b/0x60
       [<ffffffff813e9bde>] rtnl_link_unregister+0x1e/0x30
       [<ffffffffa052124b>] bonding_exit+0x37/0x51 [bonding]
       [<ffffffff81098b2e>] sys_delete_module+0x19e/0x270
       [<ffffffff810bb2b2>] ? audit_syscall_entry+0x252/0x280
       [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
       RIP  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0 [bonding]
       RSP <ffff88003b1b5d58>
       ---[ end trace 1395ad691cea24d1 ]---
      
      It occurs because of my recent netpoll blocking patches, which I added to avoid
      recursive deadlock in the bonding driver.  The blocking relies on some per-cpu
      bits, but the shutdown path forces some rescheduling as we cancel workqueues for
      the driver and wait for some device refcounts.  If, after the forced reschedule,
      we wind up on a different cpu, we trigger the BUG in unblock_netpoll_tx.
      
      The fix is to remove the netpoll block/unblock calls from bond_release_all.
      This is safe to do because bond_uninit, which is called via ndo_uninit in
      rollback_registered_many, doesn't occur until we send a NETDEV_UNREGISTER event,
      which triggers netconsole to remove us as a netpoll client, so we are guaranteed
      not to recurse into our own tx path here.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: WANG Cong <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
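The failure mode described in the commit above, where a per-cpu flag is set on one cpu and cleared on another, can be modeled in a few lines of userspace C. Everything here is a hypothetical illustration: the array, the cpu count and the function names are invented, and the `false` return marks the spot where the real unblock_netpoll_tx would hit its BUG check.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 2  /* hypothetical cpu count for the model */

/* One "tx blocked" flag per cpu, standing in for the per-cpu bits. */
static bool tx_blocked[NR_CPUS_MODEL];

static void block_tx_model(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    tx_blocked[cpu] = true;
}

/* Returns false when the flag is not set on this cpu, i.e. the case in
 * which the real code would trigger its BUG. */
static bool unblock_tx_model(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    if (!tx_blocked[cpu])
        return false;
    tx_blocked[cpu] = false;
    return true;
}
```

If the task blocks on cpu 0, sleeps, and resumes on cpu 1, the unblock call looks at the wrong flag. That is why a block/unblock pair of this kind must not span a point where the task can be rescheduled.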
  3. 19 Oct, 2010 6 commits
  4. 18 Oct, 2010 5 commits
    • bonding: Re-enable netpoll over bonding · 45b0cb8a
      Neil Horman authored
      With the inclusion of the previous fixup patches, netpoll over bonding appears
      to work reliably under failover conditions.  This reverts Gospo's previous
      commit c22d7ac8, and allows access again to the netpoll
      functionality in the bonding driver.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: Fix netconsole to not deadlock on rmmod · 3b410a31
      Neil Horman authored
      Netconsole calls netpoll_cleanup on receipt of a NETDEV_UNREGISTER event.
      The notifier subsystem calls these event handlers with rtnl_lock held, which
      netpoll_cleanup also takes, resulting in deadlock.  Fix this by calling the
      __netpoll_cleanup interior function instead, and fixing up the additional
      pointers.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
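The pattern in the commit above, an outer function that takes a lock plus an inner variant for callers that already hold it, can be sketched as follows. This is a single-threaded model under stated assumptions: the lock is a plain boolean, a failed lock attempt represents the deadlock, and all names are hypothetical rather than the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

static bool rtnl_held;                 /* models a non-recursive lock */
static bool netpoll_attached = true;   /* state the cleanup tears down */

/* Returns false where a real non-recursive lock would deadlock. */
static bool rtnl_lock_model(void)
{
    if (rtnl_held)
        return false;
    rtnl_held = true;
    return true;
}

static void rtnl_unlock_model(void)
{
    assert(rtnl_held);
    rtnl_held = false;
}

/* Inner variant: the caller must already hold the lock. */
static void cleanup_locked_model(void)
{
    assert(rtnl_held);
    netpoll_attached = false;
}

/* Outer variant: takes the lock itself, so it must not be called from a
 * context that already holds it. Returns false on the modeled deadlock. */
static bool cleanup_model(void)
{
    if (!rtnl_lock_model())
        return false;
    cleanup_locked_model();
    rtnl_unlock_model();
    return true;
}
```

A notifier handler that runs with the lock held would call the inner variant directly; calling the outer one from there is the deadlock the commit fixes.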
    • bonding: Fix napi poll for bonding driver · 990c3d6f
      Neil Horman authored
      Usually the netpoll path, when performing a napi poll, can get away with just
      polling all the napi instances of the configured device.  That's not the case for
      the bonding driver, however, as the napi instances which may wind up getting
      flagged as needing polling after the poll_controller call don't belong to the
      bonded device, but rather to the slave devices.  Fix this by checking the device
      in question for the IFF_MASTER flag; if it is set, we know we need to check the
      full poll list for this cpu, rather than just the device's napi instance list.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
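The list selection described in the commit above reduces to one flag test. The sketch below is a simplified userspace model: the structs, the flag constant and `pick_poll_list()` are hypothetical names, and the real napi_poll code is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define IFF_MASTER_MODEL 0x400u  /* hypothetical flag value for the model */

struct napi_model {
    int id;
    struct napi_model *next;
};

struct dev_model {
    unsigned int flags;
    struct napi_model *napi_list;  /* napi instances owned by this device */
};

/* A bonding master owns no napi instances of its own, so for it the
 * poller walks the per-cpu poll list, where the slaves' instances are
 * queued. Any other device just uses its own list. */
static struct napi_model *pick_poll_list(const struct dev_model *dev,
                                         struct napi_model *cpu_poll_list)
{
    assert(dev != NULL);
    if (dev->flags & IFF_MASTER_MODEL)
        return cpu_poll_list;
    return dev->napi_list;
}
```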
    • bonding: Fix deadlock in bonding driver resulting from internal locking when using netpoll · e843fa50
      Neil Horman authored
      The monitoring paths in the bonding driver take write locks that are shared by
      the tx path.  If netconsole is in use, these paths can call printk, which puts us
      in the netpoll tx path, which, if netconsole is attached to the bonding driver,
      results in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the
      monitor paths in the bonding driver don't claim the xmit_lock, nor should they).
      The solution is to use a per-cpu flag internal to the driver to indicate when a
      cpu is holding the lock in a path that might recurse into the tx path for the
      driver via netconsole.  By checking this flag on transmit, we can defer the
      sending of the netconsole frames until a later time using the retransmit feature
      of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY.  I've
      tested this and am able to transmit via netconsole while causing failover
      conditions on the bond slave links.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
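The deferral scheme in the commit above can be sketched as a transmit function that consults a per-cpu flag and reports "busy" instead of taking the lock again. This is a userspace model under stated assumptions: the flag array, the result enum and `xmit_model()` are hypothetical, and the retry that netpoll_send_skb performs on a busy return is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 2  /* hypothetical cpu count for the model */

enum tx_result_model { TX_OK_MODEL = 0, TX_BUSY_MODEL = 1 };

/* Set while a cpu holds a driver lock in a path that may call printk. */
static bool tx_blocked[NR_CPUS_MODEL];
static int frames_sent;

/* A netpoll-originated frame arriving on a cpu that is already inside
 * the locked region is refused with "busy" so it can be retried later;
 * everything else is transmitted immediately. */
static enum tx_result_model xmit_model(int cpu, bool from_netpoll)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    if (from_netpoll && tx_blocked[cpu])
        return TX_BUSY_MODEL;
    frames_sent++;
    return TX_OK_MODEL;
}
```

The point of the design is that the frame is delayed rather than dropped: once the flag is cleared, the same frame goes out on the retry.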
    • bonding: Fix bonding driver's improper modification of netpoll structure · c2355e1a
      Neil Horman authored
      The bonding driver currently modifies the netpoll structure in its xmit path
      while sending frames from netpoll.  This is racy, as other cpus can access the
      netpoll structure in parallel. Since the bonding driver points np->dev to a
      slave device, other cpus can inadvertently attempt to send data directly to
      slave devices, leading to improper locking with the bonding master, lost frames,
      and deadlocks.  This patch fixes that up.
      
      This patch also removes the real_dev pointer from the netpoll structure, as that
      data is really only used by bonding in the poll_controller, and we can emulate
      its behavior by checking each slave for IS_UP.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>