Commits · 4d3c46e6833208428d366630aa708f6876e61fc1 · Kirill Smelkov / linux

04 Sep, 2009 40 commits

sctp: drop a_rwnd to 0 when receive buffer overflows. · 4d3c46e6

Vlad Yasevich authored Sep 04, 2009

SCTP has a problem that when small chunks are used, it is possible
to exhaust the receive buffer without fully closing the receive window.
This happens due to all the overhead that we have to account for with
small messages. To fix this, when the receive buffer is exceeded, we'll
drop the window to 0 and save the 'drop' portion. When the application
starts reading data and freeing up receive buffer space, we'll wait until
we've reached the 'drop' window and then add back this 'drop' one
mtu at a time. This worked well in testing and under stress produced
rather even recovery.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
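
The accounting described above can be sketched roughly as follows. This is a simplified illustration, not the kernel code: the struct, its field names and the `buf_full` flag are hypothetical stand-ins for the association state.

```c
#include <stdint.h>

/* Hypothetical, simplified receive-window state (not the kernel's struct). */
struct rwnd_state {
    uint32_t rwnd;   /* window currently advertised to the peer */
    uint32_t press;  /* 'drop' portion withheld after a buffer overflow */
    uint32_t mtu;    /* path MTU, used as the recovery step */
};

/* Account for 'len' bytes of received data. 'buf_full' is nonzero when the
 * receive buffer (data plus per-chunk overhead) has been exceeded. */
static void rwnd_decrease(struct rwnd_state *s, uint32_t len, int buf_full)
{
    s->rwnd = (s->rwnd >= len) ? s->rwnd - len : 0;
    if (buf_full) {
        /* Save what is left of the window and advertise zero. */
        s->press += s->rwnd;
        s->rwnd = 0;
    }
}

/* Account for 'len' bytes read by the application. */
static void rwnd_increase(struct rwnd_state *s, uint32_t len)
{
    s->rwnd += len;
    /* Once the window has grown back to the saved amount, return the
     * saved portion, at most one MTU per call. */
    if (s->press && s->rwnd >= s->press) {
        uint32_t step = (s->press < s->mtu) ? s->press : s->mtu;
        s->rwnd += step;
        s->press -= step;
    }
}
```

Handing back the saved portion in MTU-sized steps is what keeps the recovery gradual instead of reopening the whole window at once.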

sctp: Clear fast_recovery on the transport when T3 timer expires. · 33ce8281

Vlad Yasevich authored Sep 04, 2009

If the T3 timer expires, we are retransmitting data due to timeout and
any fast recovery is null and void.  We can clear the fast recovery
flag.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: Fix error count increments that were results of HEARTBEATS · b9f84786

Vlad Yasevich authored Aug 26, 2009

SCTP RFC 4960 states that unacknowledged HEARTBEATS count as
errors against a given transport or endpoint.  As such, we
should increment the error counts only for unacknowledged
HB, otherwise we detect failure too soon.  This goes for both
the overall error count and the path error count.

Now, there is a difference in how the detection is done
between the two.  The path error detection is done after
the increment, so to detect it properly, we actually need
to exceed the path threshold.  The overall error detection
is done _BEFORE_ the increment.  Thus to detect the failure,
it's enough for the error count to match the threshold.
This is why all the state functions use '>=' to detect failure,
while path detection uses '>'.

Thanks go to Chunbo Luo <chunbo.luo@windriver.com> who first
proposed patches to fix this issue and made me re-read the spec
and the code to figure out how this cruft really works.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
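
To illustrate why the two comparisons differ, here is a small sketch of both detection orders. The function names are hypothetical and the real state functions do much more; only the ordering of test and increment is taken from the description above.

```c
/* Path check: the count is incremented first and tested afterwards, so
 * failure is reported only once the count exceeds the threshold. */
static int path_failed(unsigned *error_count, unsigned path_max_retrans)
{
    (*error_count)++;
    return *error_count > path_max_retrans;
}

/* Overall check: the count is tested first and incremented afterwards, so
 * it is enough for the count to match the threshold. */
static int assoc_failed(unsigned *error_count, unsigned max_retrans)
{
    int failed = *error_count >= max_retrans;
    (*error_count)++;
    return failed;
}
```

With a threshold of 2, both variants report failure on the third unacknowledged HEARTBEAT, which is the point of using '>' in one place and '>=' in the other.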

sctp: use proc_create() · d71a09ed

Alexey Dobriyan authored Aug 23, 2009

create_proc_entry() is deprecated (not formally, though).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: fix check the chunk length of received HEARTBEAT-ACK chunk · dadb50cc

Wei Yongjun authored Aug 22, 2009

The receiver of the HEARTBEAT should respond with a HEARTBEAT ACK
that contains the Heartbeat Information field copied from the
received HEARTBEAT chunk. So the received HEARTBEAT-ACK chunk
must have a length of:
  sizeof(sctp_chunkhdr_t) + sizeof(sctp_sender_hb_info_t)

With a badly formatted HB-ACK chunk, it is possible that we may access
invalid memory.  We should really make sure that the chunk format
is what we expect, before attempting to touch the data.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
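
A minimal sketch of such a length check is shown below. The two structs are hypothetical stand-ins for sctp_chunkhdr_t and sctp_sender_hb_info_t (the real layouts differ), and the sketch assumes the length has already been converted to host byte order. It rejects anything shorter than the expected size, which is the part that protects against reading past the chunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; field layout is illustrative. */
struct chunk_hdr {
    uint8_t  type;
    uint8_t  flags;
    uint16_t length;
};

struct sender_hb_info {
    uint16_t param_type;
    uint16_t param_len;
    uint8_t  payload[40];
};

/* Returns nonzero when the chunk is long enough to hold the heartbeat
 * information that the receiver is about to read. */
static int hb_ack_length_ok(size_t chunk_len)
{
    return chunk_len >= sizeof(struct chunk_hdr) + sizeof(struct sender_hb_info);
}
```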

sctp: drop SHUTDOWN chunk if the TSN is less than the CTSN · a2f36eec

Wei Yongjun authored Aug 22, 2009

If Cumulative TSN Ack field of SHUTDOWN chunk is less than the
Cumulative TSN Ack Point then drop the SHUTDOWN chunk.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
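
Because TSNs are 32-bit values that wrap, "less than" has to be a serial-number comparison rather than a plain '<'. The sketch below shows one common way to write it; the function names are hypothetical and this is not the patch itself.

```c
#include <stdint.h>

/* Serial-number "less than" for 32-bit TSNs: s precedes t when the
 * wrapped difference has its top bit set. */
static int tsn_lt(uint32_t s, uint32_t t)
{
    return ((uint32_t)(s - t) & UINT32_C(0x80000000)) != 0;
}

/* Returns nonzero when a SHUTDOWN chunk should be dropped because its
 * Cumulative TSN Ack is behind the Cumulative TSN Ack Point. */
static int shutdown_is_stale(uint32_t shutdown_cum_tsn_ack,
                             uint32_t ctsn_ack_point)
{
    return tsn_lt(shutdown_cum_tsn_ack, ctsn_ack_point);
}
```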

sctp: Send user messages to the lower layer as one · 9c5c62be

Vlad Yasevich authored Aug 10, 2009

Currently, sctp breaks up user messages into fragments and
sends each fragment to the lower layer by itself.  This means
that for each fragment we go all the way down the stack
and back up.  This also discourages bundling of multiple
fragments when they can fit into a single packet (ex: due
to user setting a low fragmentation threshold).

We introduce a new command SCTP_CMD_SND_MSG and hand the
whole message down to the state machine.  The state machine and
the side-effect parser will cork the queue, add all chunks
from the message to the queue, and then un-cork the queue
thus causing the chunks to get transmitted.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: Try to encourage SACK bundling with DATA. · 5d7ff261

Vlad Yasevich authored Aug 07, 2009

If the association has a SACK timer pending and no DATA queued
to be sent, we'll try to bundle the SACK with the next application send.
As such, try to encourage bundling by accounting for SACK in the size
of the first chunk fragment.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: Generate SACKs when actually sending outbound DATA · e83963b7

Vlad Yasevich authored Aug 07, 2009

We are now trying to bundle SACKs when we have outbound
DATA to send.  However, there are situations where this
outbound DATA will not be sent (due to congestion or 
available window).  In such cases it's ok to wait for the
timer to expire.  This patch refactors the sending code
so that before attempting to bundle the SACK we check
to see if the DATA will actually be transmitted.

Based on earlier work by Doug Graham <dgraham@nortel.com> and
Wei Yongjun <yjwei@cn.fujitsu.com>.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: Fix data segmentation with small frag_size · 3e62abf9

Vlad Yasevich authored Sep 04, 2009

Since an application may specify the maximum SCTP fragment size
that all data should be fragmented to, we need to fix how
we do segmentation.   Right now, if a user specifies a small
fragment size, the segment size can go negative in the presence
of AUTH or COOKIE_ECHO bundling.

What we need to do is track the largest possible DATA chunk that
can fit into the mtu.  Then if the fragment size specified is
bigger than this maximum length, we'll shrink it down.  Otherwise,
we just use the smaller segment size without changing it further.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
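
The clamping rule can be sketched as below. The helper names and the way overhead is split into two parameters are assumptions made for illustration; the point is only that the bundling overhead is charged against the MTU, not against the user's fragment size, so the result can never go negative.

```c
#include <stddef.h>

/* Largest DATA payload that fits in one packet after header overhead and
 * any bundled chunks (for example AUTH or COOKIE_ECHO). Returns 0 when the
 * overhead alone already fills the MTU. */
static size_t max_data_len(size_t mtu, size_t hdr_overhead,
                           size_t bundled_overhead)
{
    size_t used = hdr_overhead + bundled_overhead;
    return (mtu > used) ? mtu - used : 0;
}

/* A user fragment size larger than the maximum is shrunk to fit; a smaller
 * one is used unchanged. */
static size_t segment_len(size_t user_frag, size_t max_data)
{
    return (user_frag > max_data) ? max_data : user_frag;
}
```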

sctp: Disallow new connection on a closing socket · bec9640b

Vlad Yasevich authored Jul 30, 2009

If a socket has a lot of associations that are in the process
of being closed/aborted, it is possible for a remote to establish
new associations during the time period that the old ones are shutting
down. If this was a result of a close() call, there will be no socket
and will cause a memory leak. We'll prevent this by setting the
socket state to CLOSING and disallow new associations when in this state.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: Fix piggybacked ACKs · af87b823

Doug Graham authored Jul 29, 2009

This patch corrects the conditions under which a SACK will be piggybacked
on a DATA packet. The previous condition was incorrect due to a
misinterpretation of RFC 4960 and/or RFC 2960. Specifically, the
following paragraph from section 6.2 had not been implemented correctly:

Before an endpoint transmits a DATA chunk, if any received DATA
chunks have not been acknowledged (e.g., due to delayed ack), the
sender should create a SACK and bundle it with the outbound DATA
chunk, as long as the size of the final SCTP packet does not exceed
the current MTU. See Section 6.2.

When about to send a DATA chunk, the code now checks to see if the SACK
timer is running. If it is, we know we have a SACK to send to the
peer, so we append the SACK (assuming available space in the packet)
and turn off the timer. For a simple request-response scenario, this
will result in the SACK being bundled with the response, meaning that
the SACK is received quickly by the client, and also meaning that no
separate SACK packet needs to be sent by the server to acknowledge the
request. Prior to this patch, a separate SACK packet would have been
sent by the server SCTP only after its delayed-ACK timer had expired
(usually 200ms). This is wasteful of bandwidth, and can also have a
major negative impact on performance due to the interaction of delayed
ACKs with the Nagle algorithm.
Signed-off-by: Doug Graham <dgraham@nortel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: remove unused union (sctp_cmsg_data_t) definition · b4e8c6a7

Rami Rosen authored Jul 30, 2009

This patch removes an unused union definition (sctp_cmsg_data_t)
from include/net/sctp/user.h.
Signed-off-by: Rami Rosen <rosenrami@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: release cached route when the transport goes down. · 40187886

Vlad Yasevich authored Jun 23, 2009

When the sctp transport is marked down, we can release the
cached route and force a new lookup when attempting to use
this transport for anything.  This way, if a better route
or source address is available, we'll try to use it.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: update the route for non-active transports after addresses are added · 3cd9749c

Wei Yongjun authored Jun 16, 2009

Update the route and saddr entries for the non-active transports, as some
of the added addresses can be used as better source addresses, or
there may be a better route.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: check the unrecognized ASCONF parameter before access it · 44e65c1e

Wei Yongjun authored Jun 16, 2009

This patch checks the unrecognized ASCONF parameter before
accessing it.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

sctp: avoid overwrite the return value of sctp_process_asconf_ack() · 425e0f68

Wei Yongjun authored Jun 16, 2009

The return value of sctp_process_asconf_ack() may be
overwritten while processing parameters with no error.
This patch fixes the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

net: Fix a build break because of a typo in drivers/net/3c503.c · 8a34e2f8

Sachin Sant authored Sep 04, 2009

Signed-off-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: sja1000: legacy SJA1000 ISA bus driver · 2a6ba39a

Wolfgang Grandegger authored Sep 01, 2009

This patch adds support for legacy SJA1000 CAN controllers on the ISA
or PC-104 bus. The I/O port or memory address and the IRQ number must
be specified via module parameters:

  insmod sja1000_isa.ko port=0x310,0x380 irq=7,11

for ISA devices using I/O ports or:

  insmod sja1000_isa.ko mem=0xd1000,0xd1000 irq=7,11

for memory mapped ISA devices.

Indirect access via address and data port is supported as well:

  insmod sja1000_isa.ko port=0x310,0x380 indirect=1 irq=7,11

Here is a full list of the supported module parameters:

  port:I/O port number (array of ulong)
  mem:I/O memory address (array of ulong)
  indirect:Indirect access via address and data port (array of byte)
  irq:IRQ number (array of int)
  clk:External oscillator clock frequency (default=16000000 [16 MHz])
      (array of int)
  cdr:Clock divider register (default=0x48 [CDR_CBP | CDR_CLK_OFF])
      (array of byte)
  ocr:Output clock register (default=0x18 [OCR_TX0_PUSHPULL])
      (array of byte)

Note: for clk, cdr, ocr, the first argument re-defines the default
for all other devices, e.g.:

 insmod sja1000_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000

is equivalent to

 insmod sja1000_isa.ko mem=0xd1000,0xd1000 irq=7,11 \
                       clk=24000000,24000000
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: sja1000: fix network statistics update · 8935f57e

Wolfgang Grandegger authored Sep 01, 2009

The member "tx_bytes" of "struct net_device_stats" should be
incremented when the interrupt is done. In addition, an "arbitration
lost error" is a TX error, so the statistics should be updated
accordingly.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: add can_free_echo_skb() for upcoming drivers · 39e3ab6f

Wolfgang Grandegger authored Sep 01, 2009

This patch adds the function can_free_echo_skb to the CAN
device interface to allow upcoming drivers to release echo
skb's in case of error.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

WAN: dscc4: Fix warning pointing out a bug. · fab4e763

David S. Miller authored Sep 03, 2009

Noticed by Stephen Rothwell:

	Today's linux-next build (x86_64 allmodconfig gcc-4.4.0)
	produced this warning:

	drivers/net/wan/dscc4.c: In function 'dscc4_rx_skb':
	drivers/net/wan/dscc4.c:670: warning: suggest parentheses around comparison in operand of '|'

	which actually points out a bug, I think.  It is doing
		(x & (y | z)) != y | z
	when it probably means
		(x & (y | z)) != (y | z)

	Introduced by commit 5de3fcab
	("WAN: bit and/or confusion").
Signed-off-by: David S. Miller <davem@davemloft.net>
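
The precedence issue is easy to reproduce in isolation. In C, '!=' binds tighter than '|', so the unparenthesized form is evaluated as shown in the first function below. The function names are made up for this illustration.

```c
/* How the compiler parses "(x & (y | z)) != y | z": the comparison happens
 * first, and its 0/1 result is then OR-ed with z. The result is nonzero
 * whenever z is nonzero, regardless of x. */
static unsigned check_as_parsed(unsigned x, unsigned y, unsigned z)
{
    return ((x & (y | z)) != y) | z;
}

/* What was probably meant: nonzero only when x is missing at least one of
 * the bits in (y | z). */
static unsigned check_intended(unsigned x, unsigned y, unsigned z)
{
    return (x & (y | z)) != (y | z);
}
```

For x = 0x3, y = 0x1, z = 0x2 every requested bit is set, so the intended test yields 0 while the form as parsed still yields a nonzero value.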

ipv6: Fix tcp_v6_send_response(): it didn't set skb transport header · a8fdf2b3

Cosmin Ratiu authored Sep 03, 2009

Here is a patch which fixes an issue observed when using TCP over IPv6
and AH from IPsec.

When a connection gets closed using the 4-way method and the last ACK from
the server gets dropped, the subsequent FINs from the client do not
get ACKed because tcp_v6_send_response does not set the transport
header pointer. This causes ah6_output to try to allocate a lot of
memory, which typically fails, so the ACKs never make it out of the
stack.

I have reproduced the problem on kernel 2.6.7, but after looking at
the latest kernel it seems the problem is still there.
Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: organize device initialization/deinit into separate functions · 6fdfa970

Scott Feldman authored Sep 03, 2009

To unclutter probe() a little bit, put all device initialization code
in one spot and device deinit code in another spot.  Also remove unused
rq->buf_index variable/func.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: bug fix: check for zero port MTU before posting warning · 491598a4

Scott Feldman authored Sep 03, 2009

Nic firmware can return zero for port MTU, so check for non-zero value
before checking for change in port MTU.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: changes to driver/firmware interface · d73149f5

Scott Feldman authored Sep 03, 2009

Deprecate some old APIs; change arguments to stats dump all API; add new
interrupt assert API
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: bug fix: enable VLAN filtering · 9f63a7c6

Scott Feldman authored Sep 03, 2009

Bug fix: enable VLAN filtering
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: provision for multiple Rx/Tx queues; prepare for RSS support · 6ba9cdc0

Scott Feldman authored Sep 03, 2009

Provision for multiple Rx/Tx queues.  Max of 8 WQs and 8 RQs.  Max for
completion queue is 8+8=16 and max for interrupt resources is 8+8+2.

Add driver/firmware interface for setting up RSS secret key and indirection
table.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: bug fix: included MAC drops in rx_dropped netstat · 350991e1

Scott Feldman authored Sep 03, 2009

Bug fix: included MAC drops in rx_dropped netstat.  Also track Rx truncations
stat at the MAC
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: bug fix: protect fw call i/f with spinlock · 56ac88b3

Scott Feldman authored Sep 03, 2009

Some driver -> nic firmware calls weren't guarded with a spinlock, exposing
the call i/f to a race between two threads
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: use netdev_alloc_skb · d19e22dc

Scott Feldman authored Sep 03, 2009

Use netdev_alloc_skb rather than dev_alloc_skb
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: bug fix: split TSO fragments larger than 16K into multiple descs · ea0d7d91

Scott Feldman authored Sep 03, 2009

enic WQ desc supports a maximum 16K buf size, so split any send fragments
larger than 16K into several descs.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
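
A rough sketch of the splitting step follows. The constant and the function are hypothetical (the driver's own names and exact limit are not shown here, and 16K is taken to mean 16 * 1024 bytes); the sketch only shows how one oversized fragment maps onto several descriptor lengths.

```c
#include <stddef.h>

/* Assumed per-descriptor limit for this sketch. */
#define DESC_MAX_BUF ((size_t)16 * 1024)

/* Fills lens[] with the per-descriptor lengths for one fragment and returns
 * how many descriptors were used, or 0 if lens[] is too small (or the
 * fragment is empty). */
static size_t split_fragment(size_t frag_len, size_t *lens, size_t max_descs)
{
    size_t n = 0;

    while (frag_len > 0) {
        size_t len = (frag_len < DESC_MAX_BUF) ? frag_len : DESC_MAX_BUF;

        if (n == max_descs)
            return 0;
        lens[n++] = len;
        frag_len -= len;
    }
    return n;
}
```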

enic: workaround A0 erratum · 4badc385

Scott Feldman authored Sep 03, 2009

A0 revision ASIC has an erratum on the RQ desc cache on chip where the
cache can become corrupted, causing pkt buf writes to wrong locations. The s/w
workaround is to post a dummy RQ desc in the ring every 32 descs, causing a
flush of the cache. A0 parts are not production, but there are enough of
these parts in the wild in test setups to warrant including the workaround.
A1 revision ASIC parts fix the erratum.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: add support for multiple BARs · 27e6c7d3

Scott Feldman authored Sep 03, 2009

Nic firmware can place resources (queues, intrs, etc) on multiple BARs, so
allow driver to discover/map resources beyond BAR0.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: adds drops accounting · 1a123a31

Eric Dumazet authored Sep 03, 2009

It's hard to tell if vlans are dropping frames, since
every frame given to vlan_???_start_xmit() functions
is accounted as fully transmitted by the lower device.

We can test dev_queue_xmit() return values to
properly account for dropped frames.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
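
The idea can be sketched without any kernel types. Everything below is hypothetical: the stats struct, the callback type and the two stub "lower devices" exist only to show the accounting, with a return value of 0 standing in for a successful dev_queue_xmit().

```c
/* Hypothetical per-device counters. */
struct tx_stats {
    unsigned long packets;
    unsigned long bytes;
    unsigned long dropped;
};

/* Stand-in for handing a frame to the lower device; 0 means success. */
typedef int (*xmit_fn)(unsigned len);

/* Count the frame as transmitted only if the lower device accepted it,
 * otherwise count it as dropped. */
static void xmit_and_account(struct tx_stats *st, xmit_fn lower_xmit,
                             unsigned len)
{
    if (lower_xmit(len) == 0) {
        st->packets++;
        st->bytes += len;
    } else {
        st->dropped++;
    }
}

/* Two stub lower devices for demonstration. */
static int lower_ok(unsigned len)   { (void)len; return 0; }
static int lower_busy(unsigned len) { (void)len; return 1; }
```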

macvlan: add multiqueue capability · 2c114553

Eric Dumazet authored Sep 03, 2009

macvlan devices are currently not multi-queue capable.

We can do that by defining the rtnl_link_ops method
get_tx_queues(), called from rtnl_create_link().

This new method gets num_tx_queues/real_num_tx_queues
from lower device.

macvlan_get_tx_queues() is a copy of vlan_get_tx_queues().

Because macvlan_start_xmit() has to update netdev_queue
stats only (and not dev->stats), I chose to change
tx_errors/tx_aborted_errors accounting to tx_dropped,
since the netdev_queue structure doesn't define tx_errors /
tx_aborted_errors.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: Convert MDIO ioctl implementation to use struct mii_ioctl_data · 0fa0ee05

Ben Hutchings authored Sep 03, 2009

A few drivers still access the arguments to MDIO ioctls as an array of
u16.  Convert them to use struct mii_ioctl_data.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: Remove redundant checks for CAP_NET_ADMIN in MDIO implementations · 7ab0f273

Ben Hutchings authored Sep 03, 2009

dev_ioctl() already checks capable(CAP_NET_ADMIN) before calling the
driver's implementation of MDIO ioctls.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: Remove SIOCDEVPRIVATE aliases for MDIO ioctls · aae5e7c3

Ben Hutchings authored Sep 03, 2009

The standard MDIO ioctl numbers are well-established and these should
no longer be needed.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sky2: only enable Vaux if capable of wakeup · c23ddf8f

Stephen Hemminger authored Sep 03, 2009

While perusing the vendor driver, I saw that it did not enable the Vaux
power unless the device was able to wake from LAN for D3cold.
This might help with Rene's power issue.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
