- 28 Jan, 2008 40 commits
-
-
Denis V. Lunev authored
This makes the code in the inet6_rt_notify more straightforward and provides groud for namespace passing. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
Further testing shows that my ICMP relookup patch can cause xfrm_lookup to return zero on error which isn't very nice since it leads to the caller dying on null pointer dereference. The bug is due to not setting err to ENOENT just before we leave xfrm_lookup in case of no policy. This patch moves the err setting to where it should be. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This implements [RFC 4340, p. 32]: "any feature negotiation options received on DCCP-Data packets MUST be ignored". Also added a FIXME for further processing, since the code currently (wrongly) classifies empty Confirm options as invalid - this needs to be resolved in a separate patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This removes several `XXX' references which indicate a missing support for non-1-byte feature values: this is unnecessary, as all currently known (standardised) SP feature values are 1-byte quantities. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This removes two inlines which were both called in a single function only: 1) dccp_feat_change() is always called with either DCCPO_CHANGE_L or DCCPO_CHANGE_R as argument * from dccp_set_socktopt_change() via do_dccp_setsockopt() with DCCP_SOCKOPT_CHANGE_R/L * from __dccp_feat_init() via dccp_feat_init() also with DCCP_SOCKOPT_CHANGE_R/L. Hence the dccp_feat_is_valid_type() is completely unnecessary and always returns true. 2) Due to (1), the length test reduces to 'len >= 4', which in turn makes dccp_feat_is_valid_length() unnecessary. Furthermore, the inline function dccp_feat_is_reserved() was unfolded, since only called in a single place. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This provides a separate routine to insert options during the initial handshake. The main purpose is to conduct feature negotiation, for the moment the only user is the timestamp echo needed for the (CCID3) handshake RTT sample. Padding of options has been put into a small separate routine, to be shared among the two functions. This could also be used as a generic routine to finish inserting options. Also removed an `XXX' comment since its content was obvious. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
In DCCP, timestamps can occur on packets anytime, CCID3 uses a timestamp(/echo) on the Request/Response exchange. This patch addresses the following situation: * timestamps are recorded on the listening socket; * Responses are sent from dccp_request_sockets; * suppose two connections reach the listening socket with very small time in between: * the first timestamp value gets overwritten by the second connection request. This is not really good, so this patch separates timestamps into * those which are received by the server during the initial handshake (on dccp_request_sock); * those which are received by the client or the client after connection establishment. As before, a timestamp of 0 is regarded as indicating that no (meaningful) timestamp has been received (in addition, a warning message is printed if hosts send 0-valued timestamps). The timestamp-echoing now works as follows: * when a timestamp is present on the initial Request, it is placed into dreq, due to the call to dccp_parse_options in dccp_v{4,6}_conn_request; * when a timestamp is present on the Ack leading from RESPOND => OPEN, it is copied over from the request_sock into the child cocket in dccp_create_openreq_child; * timestamps received on an (established) dccp_sock are treated as before. Since Elapsed Time is measured in hundredths of milliseconds (13.2), the new dccp_timestamp() function is used, as it is expected that the time between receiving the timestamp and sending the timestamp echo will be very small against the wrap-around time. As a byproduct, this allows smaller timestamping-time fields. Furthermore, inserting the Timestamp Echo option has been taken out of the block starting with '!dccp_packet_without_ack()', since Timestamp Echo can be carried on any packet (5.8 and 13.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This adds option-parsing code to processing of Acks in the listening state on request_socks on the server, serving two purposes (i) resolves a FIXME (removed); (ii) paves the way for feature-negotiation during connection-setup. There is an intended subtlety here with regard to dccp_check_req: Parsing options happens only after testing whether the received packet is a retransmitted Request. Otherwise, if the Request contained (a possibly large number of) feature-negotiation options, recomputing state would have to happen each time a retransmitted Request arrives, which opens the door to an easy DoS attack. Since in a genuine retransmission the options should not be different from the original, reusing the already computed state seems better. The other point is - if there are timestamp options on the Request, they will not be answered; which means that in the presence of retransmission (likely due to loss and/or other problems), the use of Request/Response RTT sampling is suspended, so that startup problems here do not propagate. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
The option parsing code currently only parses on full sk's. This causes a problem for options sent during the initial handshake (in particular timestamps and feature-negotiation options). Therefore, this patch extends the option parsing code with an additional argument for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise the normal path (parsing on the sk) is used. Subsequent patches, which implement feature negotiation during connection setup, make use of this facility. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This replaces 4 individual assignments for `len' with a single one, placed where the control flow of those 4 leads to. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This adds a socket option and signalling support for the case where the server holds timewait state on closing the connection, as described in RFC 4340, 8.3. Since holding timewait state at the server is the non-usual case, it is enabled via a socket option. Documentation for this socket option has been added. The setsockopt statement has been made resilient against different possible cases of expressing boolean `true' values using a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This removes another Fixme, using the TCP maximum RTO rather than the value specified by the DCCP specification. Across the sections in RFC 4340, 64 seconds is consistently suggested as maximum RTO backoff value; and this is the value which is now used. I have checked both termination cases for retransmissions of Close/CloseReq: with the default value 15 of `retries2', and an initial icsk_retransmit = 0, it takes about 614 seconds to declare a non-responding peer as dead, after which the final terminating Reset is sent. With the TCP maximum RTO value of 120 seconds it takes (as might be expected) almost twice as long, about 23 minutes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
When performing active close, RFC 4340, 8.3. requires to retransmit the Close/CloseReq with a backoff-retransmit timer starting at intially 2 RTTs. This patch shifts the existing code for active-close retransmit timer into output.c, so that the retransmit timer is started when the first Close/CloseReq is sent. Previously, the timer was started when, after releasing the socket in dccp_close(), the actively-closing side had not yet reached the CLOSED/TIMEWAIT state. The patch further reduces the initial timeout from 3 seconds to the required 2 RTTs, where - in absence of a known RTT - the fallback value specified in RFC 4340, 3.4 is used. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Lezcano authored
Removed useless and buggy __exit section in the different ipv6 subsystems. Otherwise they will be called inside an init section during rollbacking in case of an error in the protocol initialization. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This patch performs two changes: 1) Close the write-end in addition to the read-end when a fin-like segment (Close or CloseReq) is received by DCCP. This accounts for the fact that DCCP, in contrast to TCP, does not have a half-close. RFC 4340 says in this respect that when a fin-like segment has been sent there is no guarantee at all that any further data will be processed. Thus this patch performs SHUT_WR in addition to the SHUT_RD when a fin-like segment is encountered. 2) Minor change: I noted that code appears twice in different places and think it makes sense to put this into a self-contained function (dccp_enqueue()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
My previous patch made the wait flag take the opposite value to what it should be. This patch fixes that. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
The caller must hold the RTNL so let's check it in unregister_netdevice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
This fixes a logical error in ICMP policy checks which lets packets through if the state ICMP flag is off. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
This patch converts all callers of xfrm_lookup that used an explicit value of 1 to indiciate blocking to use the new flag XFRM_LOOKUP_WAIT. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
The policy check I added for ICMP on IPv6 is reversed. This patch fixes that. It also adds an skb->sp check so that unprotected packets that fail the policy check do not crash the machine. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Change bnx2_init_napi() to void. Warning was noted by DaveM. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Enable new tx ring and add new MSIX handler and NAPI poll function for the new tx ring. Enable MSIX when the hardware supports it. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
To separate TX IRQs into a different MSIX vector, we need to support a new tx ring. The original tx ring will still be used when not using MSIX. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Change bnx2_napi struct into an array and add code to manage multiple IRQs. MSIX hardware structures and new registers are also added. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Rx related fields used in NAPI polling are moved from the main bnx2 struct to the bnx2_napi struct. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Tx related fields used in NAPI polling are moved from the main bnx2 struct to the bnx2_napi struct. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Introduce a bnx2_napi structure that will hold a napi_struct and other fields to handle NAPI polling for the napi_struct. Various tx and rx indexes and status block pointers will be moved from the main bnx2 structure to this bnx2_napi structure. Most NAPI path functions are modified to be passed this bnx2_napi struct pointer. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Add a table to keep track of multiple IRQs and restructure the IRQ request and free functions so that they can be easily expanded to handle multiple IRQs. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
This makes the code cleaner and easier to support different tx rings. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
If the MTU requires more than 1 page for the SKB, enable the page ring and calculate the size of the page ring. This will guarantee order-0 allocation regardless of the MTU size. Fixup loopback test packet size so that we don't deal with the pages during loopback test. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Add function to reuse a page in case of allocation or other errors. Add code to construct the completed SKB with the additional data in the pages. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Add new fields to keep track of the pages and the page rings. Add functions to allocate and free pages. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Factor out the common functions that will be used to initialize the normal RX rings and the page rings. Change the copybreak constant RX_COPY_THRESH to 128. This same constant will be used for the max. size of the linear SKB when pages are used. Copybreak will be turned off when pages are used. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Add a new function to handle new SKB allocation and to prepare the completed SKB. This makes it easier to add support for non-linear SKB. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Define the various ring constants to make the code cleaner. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Morton authored
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
Once created, an IP tunnel can't be bound to another device. (reported as https://bugzilla.redhat.com/show_bug.cgi?id=419671) To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode ipip remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: ip/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. If the change is acceptable, I'll do the same for GRE and SIT tunnels. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-