Commits · 448d7b5daf043d109df98e3e8f8deb165c2e8896 · nexedi / linux

28 Oct, 2010 18 commits

pktgen: Limit how much data we copy onto the stack. · 448d7b5d

Nelson Elhage authored Oct 28, 2010

A program that accidentally writes too much data to the pktgen file can overflow
the kernel stack and oops the machine. This is only triggerable by root, so
there's no security issue, but it's still an unfortunate bug.

printk() won't print more than 1024 bytes in a single call anyway, so let's
just never copy more than that much data. We're on a fairly shallow stack, so
that should be safe even with CONFIG_4KSTACKS.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
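
The bound described above can be illustrated with a small user-space sketch. This is not the pktgen code itself: the helper name bounded_debug_copy and the DEBUG_COPY_MAX constant are hypothetical, and the 1024-byte cap is simply taken from the printk() limit the message cites.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed cap, taken from the 1024-byte printk() limit cited in the message. */
#define DEBUG_COPY_MAX 1024

/*
 * Copy at most DEBUG_COPY_MAX - 1 bytes of src into a fixed-size buffer
 * and NUL-terminate it, however large count is.
 * Returns the number of bytes actually copied.
 */
static size_t bounded_debug_copy(char dst[DEBUG_COPY_MAX], const char *src,
                                 size_t count)
{
    size_t copy;

    assert(dst != NULL && src != NULL);
    copy = count < DEBUG_COPY_MAX - 1 ? count : DEBUG_COPY_MAX - 1;
    memcpy(dst, src, copy);
    dst[copy] = '\0';
    return copy;
}
```

The point of the clamp is that the on-stack buffer size no longer depends on the size of the write, so an oversized write is truncated instead of overrunning the stack.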

448d7b5d

net: Limit socket I/O iovec total length to INT_MAX. · 8acfe468

David S. Miller authored Oct 28, 2010

This helps protect us from overflow issues down in the
individual protocol sendmsg/recvmsg handlers.  Once
we hit INT_MAX we truncate out the rest of the iovec
by setting the iov_len members to zero.

This works because:

1) For SOCK_STREAM and SOCK_SEQPACKET sockets, partial
   writes are allowed and the application will just continue
   with another write to send the rest of the data.

2) For datagram oriented sockets, where there must be a
   one-to-one correspondence between write() calls and
   packets on the wire, INT_MAX is going to be far larger
   than the packet size limit the protocol is going to
   check for and signal with -EMSGSIZE.

Based upon a patch by Linus Torvalds.
Signed-off-by: David S. Miller <davem@davemloft.net>
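
A user-space sketch of the truncation idea follows. The function name clamp_iovec_total is hypothetical, and shortening the segment that crosses the limit is an assumption of this sketch; the message itself only states that the rest of the iovec has its iov_len members set to zero.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Walk an iovec array and cap the running total at INT_MAX bytes.
 * The segment that crosses the limit is shortened (assumption of this
 * sketch); every later segment ends up with iov_len == 0.
 * Returns the capped total length.
 */
static int clamp_iovec_total(struct iovec *iov, int nr_segs)
{
    int total = 0;
    int seg;

    assert(iov != NULL || nr_segs == 0);
    for (seg = 0; seg < nr_segs; seg++) {
        size_t room = (size_t)(INT_MAX - total);

        if (iov[seg].iov_len > room)
            iov[seg].iov_len = room;
        total += (int)iov[seg].iov_len;
    }
    return total;
}
```

Because the total can never exceed INT_MAX after this pass, code further down that stores the length in an int cannot overflow on it.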

8acfe468

USB: gadget: fix ethernet gadget crash in gether_setup · 349f6c5c

Dmitry Artamonow authored Oct 28, 2010

Crash is triggered by commit e6484930 ("net: allocate tx queues in
register_netdevice"), which moved tx netqueue creation into register_netdev.
So now calling netif_stop_queue() before register_netdev causes an oops.
Move netif_stop_queue() after net device registration to fix crash.
Signed-off-by: Dmitry Artamonow <mad_soft@inbox.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
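
The ordering problem can be modelled in user space. Everything below is a hypothetical stand-in (struct model_netdev, model_register, model_stop_queue), not kernel API; it only shows why touching queue state before registration fails once the queue storage is allocated at registration time.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_QUEUE_STOPPED 1UL

struct model_netdev {
    unsigned long *tx_state;   /* stands in for the TX queue array; NULL until registered */
};

/* Models register_netdev(): the TX queue state is allocated here. */
static int model_register(struct model_netdev *dev)
{
    assert(dev != NULL);
    dev->tx_state = calloc(1, sizeof(*dev->tx_state));
    return dev->tx_state ? 0 : -1;
}

/*
 * Models netif_stop_queue(): returns -1 in the situation where the real
 * call would dereference a NULL pointer and oops.
 */
static int model_stop_queue(struct model_netdev *dev)
{
    assert(dev != NULL);
    if (dev->tx_state == NULL)
        return -1;
    *dev->tx_state |= MODEL_QUEUE_STOPPED;
    return 0;
}
```

In this model, calling model_stop_queue() before model_register() fails, and the same call after registration succeeds, which mirrors the reordering the fix describes.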

349f6c5c

fib: Fix fib zone and its hash leak on namespace stop · 4aa2c466

Pavel Emelyanov authored Oct 28, 2010

When we stop a namespace we flush the table and free it, but the
added fn_zone-s (and their hashes, if grown) are leaked and need to be freed.
Tries release all their data in the flushing code.

Shame on us - this bug exists since the very first make-fib-per-net
patches in 2.6.27 :(
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4aa2c466

cxgb3: Fix panic in free_tx_desc() · b1424ed9

Krishna Kumar authored Oct 27, 2010

I got a few of these panics (on 2.6.36-rc7) when running a high
number of netperf sessions:

BUG: unable to handle kernel paging request at 0000100000000000
IP: [<ffffffff813125f0>] skb_release_data+0xa0/0xd0
Oops: 0000 [#1] SMP
Pid: 2155, comm: vhost-2115 Not tainted 2.6.36-rc7-ORG #1 49Y6512     /System x3650 M2 -[7947AC1]-
RIP: 0010:[<ffffffff813125f0>]  [<ffffffff813125f0>] skb_release_data+0xa0/0xd0
RSP: 0018:ffff880001803738  EFLAGS: 00010206
RAX: ffff880179b0fc00 RBX: ffff880178b441c0 RCX: 0000000000000000
RDX: ffff880179b0fd40 RSI: 0000000000000000 RDI: 0000100000000000
RBP: ffff880001803748 R08: 0000000000000001 R09: ffff88017f117000
R10: ffff88017b990608 R11: ffff88017f117090 R12: ffff880178b441c0
R13: ffff88017f117090 R14: 0000000000000000 R15: ffff880178b441c0
FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000100000000000 CR3: 000000017ea64000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vhost-2115 (pid: 2155, threadinfo ffff88017d872000, task ffff88017e954680)
Stack:
ffff880178b441c0 0000000000000007 ffff880001803768 ffffffff81312119
<0> 0000000000000000 0000000000000002 ffff880001803778 ffffffff813121f9
<0> ffff880001803818 ffffffffa012d14c ffffffffa02de076 ffff880001803700
Call Trace:
<IRQ>
[<ffffffff81312119>] __kfree_skb+0x19/0xa0
[<ffffffff813121f9>] kfree_skb+0x19/0x40
[<ffffffffa012d14c>] free_tx_desc+0x2fc/0x350 [cxgb3]
[<ffffffffa02de076>] ? vhost_poll_wakeup+0x16/0x20 [vhost_net]
[<ffffffffa01323db>] t3_eth_xmit+0x28b/0x380 [cxgb3]
[<ffffffff8131ce47>] dev_hard_start_xmit+0x377/0x5a0
[<ffffffff81335a4a>] sch_direct_xmit+0xfa/0x1d0
[<ffffffff8131d1a9>] dev_queue_xmit+0x139/0x450
[<ffffffff81326225>] neigh_resolve_output+0x125/0x340
[<ffffffff8135a77c>] ip_finish_output+0x14c/0x320
[<ffffffff8135a9fe>] ip_output+0xae/0xc0
[<ffffffff8135620f>] ip_forward_finish+0x3f/0x50
[<ffffffff8135641f>] ip_forward+0x1ff/0x400
[<ffffffff81354789>] ip_rcv_finish+0x119/0x3e0
[<ffffffff81354c7d>] ip_rcv+0x22d/0x300
[<ffffffff8131a95b>] __netif_receive_skb+0x29b/0x570
[<ffffffff8131ba70>] ? netif_receive_skb+0x0/0x80
[<ffffffff8131bae8>] netif_receive_skb+0x78/0x80
[<ffffffffa02a96d8>] br_handle_frame_finish+0x198/0x260 [bridge]
[<ffffffffa02aebc8>] br_nf_pre_routing_finish+0x238/0x380 [bridge]
[<ffffffff813424bc>] ? nf_hook_slow+0x6c/0x100
[<ffffffffa02ae990>] ? br_nf_pre_routing_finish+0x0/0x380 [bridge]
[<ffffffffa02afb08>] br_nf_pre_routing+0x698/0x7a0 [bridge]
[<ffffffff81342414>] nf_iterate+0x64/0xa0
[<ffffffffa02a9540>] ? br_handle_frame_finish+0x0/0x260 [bridge]
[<ffffffff813424bc>] nf_hook_slow+0x6c/0x100
[<ffffffffa02a9540>] ? br_handle_frame_finish+0x0/0x260 [bridge]
[<ffffffffa02a9931>] br_handle_frame+0x191/0x240 [bridge]
[<ffffffffa02a97a0>] ? br_handle_frame+0x0/0x240 [bridge]
[<ffffffff8131a863>] __netif_receive_skb+0x1a3/0x570
[<ffffffff812ef3f6>] ? dma_issue_pending_all+0x76/0xa0
[<ffffffff8131ad32>] process_backlog+0x102/0x200
[<ffffffff8131c2d0>] net_rx_action+0x100/0x220
[<ffffffff810548ef>] __do_softirq+0xaf/0x140
[<ffffffff8100bcdc>] call_softirq+0x1c/0x30
[<ffffffff8100dfc5>] ? do_softirq+0x65/0xa0
[<ffffffff8131c6b8>] netif_rx_ni+0x28/0x30
[<ffffffffa02c305d>] tun_sendmsg+0x2cd/0x4b0 [tun]
[<ffffffffa02e01af>] handle_tx+0x1df/0x340 [vhost_net]
[<ffffffffa02e0340>] handle_tx_kick+0x10/0x20 [vhost_net]
[<ffffffffa02de29b>] vhost_worker+0xbb/0x130 [vhost_net]
[<ffffffffa02de1e0>] ? vhost_worker+0x0/0x130 [vhost_net]
[<ffffffffa02de1e0>] ? vhost_worker+0x0/0x130 [vhost_net]
[<ffffffff81069686>] kthread+0x96/0xa0
[<ffffffff8100bbe4>] kernel_thread_helper+0x4/0x10
[<ffffffff810695f0>] ? kthread+0x0/0xa0
[<ffffffff8100bbe0>] ? kernel_thread_helper+0x0/0x10
Code: 8b 94 24 d0 00 00 00 49 8b 84 24 d8 00 00 00 48 8d 14 10 0f b7 0a 39 d9 7f d1 48 8b 7a 10 48 85 ff 74 20 48 c7 42 10 00 00 00 00 <48> 8b 1f e8 e8 fb ff ff 48 85 db 48 89 df 75 f0 49 8b 84 24 d8

Patch below fixes the panic. cxgb4 and cxgb4vf already have this fix.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1424ed9

cxgb3: fix crash due to manipulating queues before registration · 69dcfc8a

Nishanth Aravamudan authored Oct 27, 2010

Along the same lines as "cxgb4: fix crash due to manipulating queues
before registration" (8f6d9f40): before
commit "net: allocate tx queues in register_netdevice",
netif_tx_stop_all_queues and related functions could be used between
device allocation and registration, but now they can be used only after
registration. cxgb3 has such a call before registration and crashes now.
Move it after register_netdev.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: eric.dumazet@gmail.com
Cc: sonnyrao@us.ibm.com
Cc: Divy Le Ray <divy@chelsio.com>
Cc: Dimitris Michailidis <dm@chelsio.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

69dcfc8a

8390: Don't oops on starting dev queue · b7126d8c

Pavel Emelyanov authored Oct 27, 2010

The __NS8390_init tries to start the device queue before the
device is registered. This results in an oops (snipped):

[    2.865493] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[    2.866106] IP: [<ffffffffa000602a>] netif_start_queue+0xb/0x12 [8390]
[    2.881267] Call Trace:
[    2.881437]  [<ffffffffa000624d>] __NS8390_init+0x102/0x15a [8390]
[    2.881999]  [<ffffffffa00062ae>] NS8390_init+0x9/0xb [8390]
[    2.882237]  [<ffffffffa000d820>] ne2k_pci_init_one+0x297/0x354 [ne2k_pci]
[    2.882955]  [<ffffffff811c7a0e>] local_pci_probe+0x12/0x16
[    2.883308]  [<ffffffff811c85ad>] pci_device_probe+0xc3/0xef
[    2.884049]  [<ffffffff8129218d>] driver_probe_device+0xbe/0x14b
[    2.884937]  [<ffffffff81292260>] __driver_attach+0x46/0x62
[    2.885170]  [<ffffffff81291788>] bus_for_each_dev+0x49/0x78
[    2.885781]  [<ffffffff81291fbb>] driver_attach+0x1c/0x1e
[    2.886089]  [<ffffffff812912ab>] bus_add_driver+0xba/0x227
[    2.886330]  [<ffffffff8129259a>] driver_register+0x9e/0x115
[    2.886933]  [<ffffffff811c8815>] __pci_register_driver+0x50/0xac
[    2.887785]  [<ffffffffa001102c>] ne2k_pci_init+0x2c/0x2e [ne2k_pci]
[    2.888093]  [<ffffffff81000212>] do_one_initcall+0x7c/0x130
[    2.888693]  [<ffffffff8106d74f>] sys_init_module+0x99/0x1da
[    2.888946]  [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b

This happens because netif_start_queue sets the respective bit in the dev->_tx
array, which is not yet allocated.

As far as I understand the code, removing netif_start_queue from __NS8390_init
is OK, since the queue will be started later on device open. Please correct me if I'm wrong.

Found in Dave's current tree, so he's in Cc.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7126d8c

dccp ccid-2: Stop polling · 1c0e0a05

Gerrit Renker authored Oct 27, 2010

This updates CCID-2 to use the CCID dequeuing mechanism, converting it from
the previous continuous polling to an event-driven mechanism.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c0e0a05

dccp: Refine the wait-for-ccid mechanism · b1fcf55e

Gerrit Renker authored Oct 27, 2010

This extends the existing wait-for-ccid routine so that it may be used with
different types of CCID, addressing the following problems:

 1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
    example has a full TX queue and becomes network-limited just as the
    application wants to close, then waiting for CCID-2 to become unblocked
    could lead to an indefinite delay (i.e., application "hangs").
 2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
    in its sending policy while the queue is being drained. This can lead to
    further delays during which the application will not be able to terminate.
 3) The minimum wait time for CCID-3/4 can be expected to be the queue length
    times the current inter-packet delay. For example if tx_qlen=100 and a delay
    of 15 ms is used for each packet, then the application would have to wait
    for a minimum of 1.5 seconds before being allowed to exit.
 4) There is no way for the user/application to control this behaviour. It would
    be good to use the timeout argument of dccp_close() as an upper bound. Then
    the maximum time that an application is willing to wait for its CCIDs can
    be set via the SO_LINGER option.

These problems are addressed by giving the CCID a grace period of up to the
`timeout' value.

The wait-for-ccid function is, as before, used when the application
 (a) has read all the data in its receive buffer and
 (b) if SO_LINGER was set with a non-zero linger time, or
 (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
     state (client application closes after receiving CloseReq).

In addition, there is a catch-all case of __skb_queue_purge() after waiting for
the CCID. This is necessary since the write queue may still have data when
 (a) the host has been passively-closed,
 (b) abnormal termination (unread data, zero linger time),
 (c) wait-for-ccid could not finish within the given time limit.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
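
The arithmetic in point 3 and the upper bound proposed in point 4 can be worked through in a few lines. The function names are hypothetical and this is only a sketch of the calculation, not of the wait-for-ccid routine itself.

```c
#include <assert.h>
#include <limits.h>

/*
 * Minimum time to drain the TX queue of a rate-based CCID:
 * queue length times the current inter-packet delay.
 */
static unsigned long min_drain_time_ms(unsigned long tx_qlen,
                                       unsigned long ipd_ms)
{
    /* Guard against overflow of the product. */
    assert(tx_qlen == 0 || ipd_ms <= ULONG_MAX / tx_qlen);
    return tx_qlen * ipd_ms;
}

/* Cap the wait at the linger timeout, used here as the grace period. */
static unsigned long bounded_wait_ms(unsigned long drain_ms,
                                     unsigned long linger_ms)
{
    return drain_ms < linger_ms ? drain_ms : linger_ms;
}
```

With tx_qlen=100 and a 15 ms delay, min_drain_time_ms() gives 1500 ms, the 1.5 seconds quoted above; with a 1000 ms linger timeout the wait would be cut off at 1000 ms and the remaining packets purged.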

b1fcf55e

dccp: Extend CCID packet dequeueing interface · dc841e30

Gerrit Renker authored Oct 27, 2010

This extends the packet dequeuing interface of dccp_write_xmit() to allow
 1. CCIDs to take care of timing when the next packet may be sent;
 2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).

The main purpose is to take CCID-2 out of its polling mode (when it is network-
limited, it tries every millisecond to send, without interruption).

The mode of operation for (2) is as follows:
 * new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
 * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full),
 * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
 * dccp_write_xmit() returns without further action;
 * after some time the wait-condition for CCID becomes true,
 * that CCID schedules the tasklet,
 * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
 * since the wait-condition is now true, ccid_hc_tx_send_packet() returns "send now",
 * packet is sent, and possibly more (since dccp_write_xmit() loops).

Code reuse: the tasklet function calls dccp_write_xmit(), the timer function
            reduces to a wrapper around the same code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc841e30

dccp: Return-value convention of hc_tx_send_packet() · fe84f414

Gerrit Renker authored Oct 27, 2010

This patch reorganises the return value convention of the CCID TX sending
function, to permit more flexible schemes, as required by subsequent patches.

Currently the convention is
 * values < 0     mean error,
 * a value == 0   means "send now", and
 * a value x > 0  means "send in x milliseconds".

The patch provides symbolic constants and a function to interpret return values.

In addition, it caps the maximum positive return value to 0xFFFF milliseconds,
corresponding to 65.535 seconds.  This is possible since in CCID-3/4 the
maximum possible inter-packet gap is fixed at t_mbi = 64 sec.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

fe84f414

igbvf: fix panic on load · de7fe787

Emil Tantilov authored Oct 28, 2010

Introduced by commit e6484930 ("net: allocate tx queues in
register_netdevice").
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Greg Rose <greg.v.rose@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de7fe787

ixgb: call pci_disable_device in ixgb_remove · ec43a81c

Emil Tantilov authored Oct 28, 2010

ixgb fails to work after reload on recent kernels:

rmmod ixgb (dev->current_state = PCI_UNKNOWN)
modprobe ixgb (pci_enable_device will bail leaving current_state to PCI_UNKNOWN)
ifup eth0
do_IRQ: 2.82 No irq handler for vector (irq -1)

The issue was exposed by commit fcd097f3 ("PCI: MSI: Remove unsafe and
unnecessary hardware access"), which avoids HW writes for power
states != PCI_D0.

CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec43a81c

ixgbe: DCB, fix TX hang occurring in stress condition with PFC · 9806307a

John Fastabend authored Oct 28, 2010

The DCB credits refill quantum _must_ be greater than half the max
packet size. This is needed to guarantee that TX DMA operations
are not attempted during a pause state. Additionally, the min IFG
must be set correctly for DCB mode. If a DMA operation is
requested unexpectedly during the pause state the HW data
store may be corrupted leading to a DMA hang.  The DMA hang
requires a reset to correct. This fixes the HW configuration
to avoid this condition.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9806307a

e1000e: Add check for reset flags before displaying reset message · affa9dfb

Carolyn Wyborny authored Oct 28, 2010

Some parts need to execute resets during normal operation.  This flag
check ensures that those parts reset without needlessly alarming the
user.  Other unexpected resets by other parts will dump debug info
and message the reset action to the user, as originally intended.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

affa9dfb

e1000e: reset PHY after errors detected · ff10e13c

Carolyn Wyborny authored Oct 28, 2010

Some errors can be induced in the PHY via environmental testing
(specifically extreme temperature changes and electrostatic
discharge testing), and in the case of the PHY hanging due to
this input, this detects the problem and resets to continue.
This issue only applies to 82574 silicon.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff10e13c

pch_gbe: Select MII. · 116c1ea0

David S. Miller authored Oct 28, 2010

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

116c1ea0

igb: Fix unused variable warning. · c1758012

Jesse Gross authored Oct 26, 2010

Commit eab6d18d "vlan: Don't check for vlan group before
vlan_tx_tag_present" removed the need for the adapter variable
in igb_xmit_frame_ring_adv().  This removes the variable as well
to avoid the compiler warning.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1758012

27 Oct, 2010 22 commits

ehea: Fixing statistics · ce45b873

Breno Leitao authored Oct 27, 2010

(Applied over Eric's "ehea: fix use after free" patch)

Currently ehea stats are broken. The byte counters come from
the hardware, while the packet counters come from the device
driver. Also, the device driver counters are reset during the
down process while the hardware counters are not, causing some weird
numbers.

This patch just consolidates the packet and byte counters in the device
driver.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce45b873

bonding: Fix lockdep warning after bond_vlan_rx_register() · a71fb881

Jarek Poplawski authored Oct 27, 2010

Fix lockdep warning:
[   52.991402] ======================================================
[   52.991511] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[   52.991569] 2.6.36-04573-g4b60626-dirty #65
[   52.991622] ------------------------------------------------------
[   52.991696] ip/4842 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
[   52.991758]  (&bond->lock){++++..}, at: [<efe4d300>] bond_set_multicast_list+0x60/0x2c0 [bonding]
[   52.991966]
[   52.991967] and this task is already holding:
[   52.992008]  (&bonding_netdev_addr_lock_key){+.....}, at: [<c04e5530>] dev_mc_sync+0x50/0xa0
[   52.992008] which would create a new lock dependency:
[   52.992008]  (&bonding_netdev_addr_lock_key){+.....} -> (&bond->lock){++++..}
[   52.992008]
[   52.992008] but this new dependency connects a SOFTIRQ-irq-safe lock:
[   52.992008]  (&(&mc->mca_lock)->rlock){+.-...}
[   52.992008] ... which became SOFTIRQ-irq-safe at:
[   52.992008]   [<c0272beb>] __lock_acquire+0x96b/0x1960
[   52.992008]   [<c027415e>] lock_acquire+0x7e/0xf0
[   52.992008]   [<c05f356d>] _raw_spin_lock_bh+0x3d/0x50
[   52.992008]   [<c0584e40>] mld_ifc_timer_expire+0xf0/0x280
[   52.992008]   [<c024cee6>] run_timer_softirq+0x146/0x310
[   52.992008]   [<c024591d>] __do_softirq+0xad/0x1c0
[   52.992008]
[   52.992008] to a SOFTIRQ-irq-unsafe lock:
[   52.992008]  (&bond->lock){++++..}
[   52.992008] ... which became SOFTIRQ-irq-unsafe at:
[   52.992008] ...  [<c0272c3b>] __lock_acquire+0x9bb/0x1960
[   52.992008]   [<c027415e>] lock_acquire+0x7e/0xf0
[   52.992008]   [<c05f36b8>] _raw_write_lock+0x38/0x50
[   52.992008]   [<efe4cbe4>] bond_vlan_rx_register+0x24/0x70 [bonding]
[   52.992008]   [<c0598010>] register_vlan_dev+0xc0/0x280
[   52.992008]   [<c0599f3a>] vlan_newlink+0xaa/0xd0
[   52.992008]   [<c04ed4b4>] rtnl_newlink+0x404/0x490
[   52.992008]   [<c04ece35>] rtnetlink_rcv_msg+0x1e5/0x220
[   52.992008]   [<c050424e>] netlink_rcv_skb+0x8e/0xb0
[   52.992008]   [<c04ecbac>] rtnetlink_rcv+0x1c/0x30
[   52.992008]   [<c0503bfb>] netlink_unicast+0x24b/0x290
[   52.992008]   [<c0503e37>] netlink_sendmsg+0x1f7/0x310
[   52.992008]   [<c04cd41c>] sock_sendmsg+0xac/0xe0
[   52.992008]   [<c04ceb80>] sys_sendmsg+0x130/0x230
[   52.992008]   [<c04cf04e>] sys_socketcall+0xde/0x280
[   52.992008]   [<c0202d10>] sysenter_do_call+0x12/0x36
[   52.992008]
[   52.992008] other info that might help us debug this:
...
[ Full info at netdev: Wed, 27 Oct 2010 12:24:30 +0200
  Subject: [BUG net-2.6 vlan/bonding] lockdep splats ]

Use BH variant of write_lock(&bond->lock) (as elsewhere in bond_main)
to prevent this dependency.

Fixes commit f35188fa [v2.6.36]
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>

a71fb881

tunnels: Fix tunnels change rcu protection · 74b0b85b

Pavel Emelyanov authored Oct 27, 2010

After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug
was introduced into the SIOCCHGTUNNEL code.

The tunnel is first unlinked, then its addresses are changed, then it is linked
back, probably into another bucket. But while the parms are being changed, the
hash table is unlocked to readers, and they can look up the wrong tunnel.

Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b
(gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and
94767632 (ip6tnl: get rid of ip6_tnl_lock).

The quick fix is to wait for quiescent state to pass after unlinking,
but if it is inappropriate I can invent something better, just let me
know.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

74b0b85b

caif-u5500: Build config for CAIF shared mem driver · 1933f0c0

Amarnath Revanna authored Oct 27, 2010

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1933f0c0

caif-u5500: CAIF shared memory mailbox interface · e57731f4

Amarnath Revanna authored Oct 27, 2010

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e57731f4

caif-u5500: CAIF shared memory transport protocol · dfae55d6

sjur.brandeland@stericsson.com authored Oct 27, 2010

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dfae55d6

caif-u5500: Adding shared memory include · a10c0203

Amarnath Revanna authored Oct 27, 2010

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a10c0203

drivers/isdn: delete double assignment · 4101e976

Julia Lawall authored Oct 26, 2010

Delete successive assignments to the same location.  In the first case, the
hscx array has two elements, so change the assignment to initialize the
second one.  In the second case, the two assignments are simply identical.
Furthermore, neither is necessary, because the effect of the assignment is
only visible in the next line, in the assignment in the if test.  The patch
inlines the right hand side value in the latter assignment and pulls that
assignment out of the if test.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression i;
@@

*i = ...;
 i = ...;
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

4101e976

drivers/net/typhoon.c: delete double assignment · 13c3ab86

Julia Lawall authored Oct 26, 2010

Delete successive assignments to the same location.  The current definition
does not initialize the respRing structure, which has the same type as the
cmdRing structure, so initialize that one instead.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression i;
@@

*i = ...;
 i = ...;
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

13c3ab86

drivers/net/sb1000.c: delete double assignment · d58c0e95

Julia Lawall authored Oct 26, 2010

The other code around these duplicated assignments initializes elements 0, 1, 2
and 3 of an array, so change the initialization of the
rx_session_id array to do the same.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression i;
@@

*i = ...;
 i = ...;
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

d58c0e95

qlcnic: define valid vlan id range · 0184bbba

Sony Chacko authored Oct 26, 2010

VLAN ID 4095 is reserved and should not be used.
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0184bbba

qlcnic: reduce rx ring size · 90d19005

Sony Chacko authored Oct 26, 2010

If eswitch is enabled, the rcv ring size can be reduced, as
the physical port is partitioned.
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

90d19005

qlcnic: fix mac learning · e5edb7b1

amit salecha authored Oct 26, 2010

In the failover bonding case, the same mac address can be programmed on another slave function.
Fw will delete the old entry (original func) associated with that mac address.
We need to reprogram the mac address if failover happens again to the original function.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5edb7b1

ehea: fix use after free · e5ccd961

Eric Dumazet authored Oct 26, 2010

ehea_start_xmit() dereferences skb after it has been freed in ehea_xmit3() to
get vlan tags.

Move the offending block before the potential ehea_xmit3() call.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
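
The ordering rule behind this fix can be sketched in user space: read everything you need from a buffer before handing it to a function that may free it. The types and functions below (struct model_skb, model_xmit_and_free, xmit_read_first) are hypothetical stand-ins, not the ehea driver code.

```c
#include <assert.h>
#include <stdlib.h>

struct model_skb {
    unsigned short vlan_tag;
};

/* Models a transmit helper that takes ownership and frees the buffer. */
static void model_xmit_and_free(struct model_skb *skb)
{
    free(skb);
}

/*
 * Safe ordering: capture the field first, then hand the buffer off.
 * After model_xmit_and_free() returns, skb must not be dereferenced.
 */
static unsigned short xmit_read_first(struct model_skb *skb)
{
    unsigned short tag;

    assert(skb != NULL);
    tag = skb->vlan_tag;        /* read while the buffer is still owned */
    model_xmit_and_free(skb);   /* ownership ends here */
    return tag;
}
```

Swapping the two statements in xmit_read_first() would reproduce the use after free: the read would then touch memory that has already been released.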

e5ccd961

inetpeer: __rcu annotations · b914c4ea

Eric Dumazet authored Oct 25, 2010

Adds __rcu annotations to inetpeer
	(struct inet_peer)->avl_left
	(struct inet_peer)->avl_right

This is a tedious cleanup, but it removes one smp_wmb() from link_to_pool()
since we now use the more self-documenting rcu_assign_pointer().

Note the use of RCU_INIT_POINTER() instead of rcu_assign_pointer() in
all cases where we don't need a memory barrier.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b914c4ea

fib_rules: __rcu annotates ctarget · 7a2b03c5

Eric Dumazet authored Oct 26, 2010

Adds __rcu annotation to (struct fib_rule)->ctarget
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a2b03c5

tunnels: add __rcu annotations · b33eab08

Eric Dumazet authored Oct 25, 2010

Add __rcu annotations to :
        (struct ip_tunnel)->prl
        (struct ip_tunnel_prl_entry)->next
        (struct xfrm_tunnel)->next
	struct xfrm_tunnel *tunnel4_handlers
	struct xfrm_tunnel *tunnel64_handlers

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b33eab08

net: add __rcu annotations to protocol · e0ad61ec

Eric Dumazet authored Oct 25, 2010

Add __rcu annotations to :
        struct net_protocol *inet_protos
        struct net_protocol *inet6_protos

And use appropriate casts to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0ad61ec

ipv4: add __rcu annotations to routes.c · 1c31720a

Eric Dumazet authored Oct 25, 2010

Add __rcu annotations to :
        (struct dst_entry)->rt_next
        (struct rt_hash_bucket)->chain

And use appropriate rcu primitives to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c31720a

qlge: bugfix: Restoring the vlan setting. · c1b60092

Ron Mercer authored Oct 27, 2010

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1b60092

be2net: Schedule/Destroy worker thread in probe()/remove() rather than open()/close() · f203af70

Somnath Kotur authored Oct 25, 2010

When async MCC completions are received on an interface that is down (and so interrupts are disabled),
they just lie unprocessed in the completion queue. The completion queue can eventually fill
up and cause the BE to lock up. The fix is to use be_worker to reap MCC completions when the
interface is down. be_worker is now launched in be_probe() and canceled in be_remove().
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f203af70

ipv6: fix refcnt problem related to POSTDAD state · 853dc2e0

Ursula Braun authored Oct 24, 2010

After running this bonding setup script
    modprobe bonding miimon=100 mode=0 max_bonds=1
    ifconfig bond0 10.1.1.1/16
    ifenslave bond0 eth1
    ifenslave bond0 eth3
on s390 with qeth-driven slaves, modprobe -r fails with this message
    unregister_netdevice: waiting for bond0 to become free. Usage count = 1
because the duplicate address is detected twice.
The problem is caused by a missing decrement of ifp->refcnt in addrconf_dad_failure.
An extra call to in6_ifa_put(ifp) solves it.
The problem was introduced by commit f2344a13.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
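
The kind of reference-count imbalance described here can be modelled in user space. The sketch below is an illustration only: struct model_ifaddr and the model_* functions are hypothetical, and the early-return path is an assumption about the general shape of such bugs, not a reconstruction of addrconf_dad_failure.

```c
#include <assert.h>

struct model_ifaddr {
    int refcnt;
    int freed;
};

static void model_hold(struct model_ifaddr *ifp)
{
    ifp->refcnt++;
}

/* Drop one reference; the object is released when the count hits zero. */
static void model_put(struct model_ifaddr *ifp)
{
    assert(ifp->refcnt > 0);
    if (--ifp->refcnt == 0)
        ifp->freed = 1;
}

/*
 * Failure handler that is handed one reference by its caller and must
 * drop it on every path, including the early return.
 */
static void model_dad_failure(struct model_ifaddr *ifp, int already_handled)
{
    if (already_handled) {
        model_put(ifp);   /* the put an early-return path can easily miss */
        return;
    }
    /* ... normal failure handling would go here ... */
    model_put(ifp);
}
```

If the put on the early-return path is omitted, the count never drops back to the owner's single reference, which corresponds to the "Usage count = 1" symptom at unregister time.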

853dc2e0