Commit 1b627d17 authored by Linus Torvalds

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (170 commits)
  commit 3d9dd756
  Author: Zach Brown <zach.brown@oracle.com>
  Date:   Fri Apr 14 16:04:18 2006 -0700
  
      [PATCH] ip_output: account for fraggap when checking to add trailer_len
      
      During other work I noticed that ip_append_data() seemed to be forgetting to
      include the frag gap in its calculation of a fragment that consumes the rest of
      the payload.  Herbert confirmed that this was a bug that snuck in during a
      previous rework.
      
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  
  commit 08d09997
  Author: Linus Walleij <triad@df.lth.se>
  Date:   Fri Apr 14 16:03:33 2006 -0700
  
      [IRDA]: smsc-ircc2, smcinit support for ALi ISA bridges
      
  ...
parents f2f4d9e8 3d9dd756
The sync patches work is based on initial patches from
Krisztian <hidden@balabit.hu> and others and additional patches
from Jamal <hadi@cyberus.ca>.
The end goal for syncing is to be able to insert attributes + generate
events so that an SA can be safely moved from one machine to another
for HA purposes.
The idea is to synchronize the SA so that the takeover machine can do
the processing of the SA as accurately as possible if it has access to it.
We already have the ability to generate SA add/del/upd events.
These patches add the ability to sync and have accurate lifetime byte (to
ensure proper decay of SAs) and replay counters to avoid replay attacks
with minimal loss at failover time.
This way a backup stays as closely up-to-date as an active member.
Because the above items change for every packet the SA receives,
a lot of events can be generated.
For this reason, we also add a nagle-like algorithm to restrict
the events, i.e. we are going to set thresholds to say "let me
know if the replay sequence threshold is reached or 10 secs have passed".
These thresholds are set system-wide via sysctls or can be updated
per SA.
The identified items that need to be synchronized are:
- the lifetime byte counter
  Note that the lifetime time limit is not important if you assume the
  failover machine is known ahead of time, since the decay of the time
  countdown is not driven by packet arrival.
- the replay sequence for both inbound and outbound
1) Message Structure
----------------------
nlmsghdr:aevent_id:optional-TLVs.
The netlink message types are:
XFRM_MSG_NEWAE and XFRM_MSG_GETAE.
A XFRM_MSG_GETAE does not have TLVs.
A XFRM_MSG_NEWAE will have at least two TLVs (as is
discussed further below).
aevent_id structure looks like:
   struct xfrm_aevent_id {
             struct xfrm_usersa_id           sa_id;
             __u32                           flags;
   };
xfrm_usersa_id in this message layout identifies the SA.
flags are used to indicate different things. The possible
flags are:
        XFRM_AE_RTHR=1, /* replay threshold */
        XFRM_AE_RVAL=2, /* replay value */
        XFRM_AE_LVAL=4, /* lifetime value */
        XFRM_AE_ETHR=8, /* expiry timer threshold */
        XFRM_AE_CR=16, /* Event cause is replay update */
        XFRM_AE_CE=32, /* Event cause is timer expiry */
        XFRM_AE_CU=64, /* Event cause is policy update */
How these flags are used depends on the direction of the
message (kernel<->user) as well as the cause (config, query or event).
This is described below in the different messages.
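As a rough illustration of how a listener might read the flags field, the sketch below maps the three cause bits to a small enum. The SK_-prefixed constants are local mirrors of the values listed above, and ae_cause_of() is a hypothetical helper, not part of the kernel API.

```c
#include <stdint.h>

/* Local mirrors of the flag values listed in the text (assumed names). */
enum {
	SK_AE_RTHR = 1,		/* replay threshold */
	SK_AE_RVAL = 2,		/* replay value */
	SK_AE_LVAL = 4,		/* lifetime value */
	SK_AE_ETHR = 8,		/* expiry timer threshold */
	SK_AE_CR   = 16,	/* event cause is replay update */
	SK_AE_CE   = 32,	/* event cause is timer expiry */
	SK_AE_CU   = 64,	/* event cause is policy update */
};

enum ae_cause {
	AE_CAUSE_NONE,
	AE_CAUSE_REPLAY,
	AE_CAUSE_TIMER,
	AE_CAUSE_UPDATE,
};

/* Hypothetical helper: report which cause bit, if any, is set in flags. */
static enum ae_cause ae_cause_of(uint32_t flags)
{
	if (flags & SK_AE_CR)
		return AE_CAUSE_REPLAY;
	if (flags & SK_AE_CE)
		return AE_CAUSE_TIMER;
	if (flags & SK_AE_CU)
		return AE_CAUSE_UPDATE;
	return AE_CAUSE_NONE;
}
```

A message that carries only request bits such as RTHR or ETHR has no cause bit set, so the helper returns AE_CAUSE_NONE for it.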
The pid will be set appropriately in netlink to recognize direction
(0 to the kernel and pid = processid that created the event
when going from kernel to user space).
A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
to get notified of these events.
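A minimal sketch of such a subscription, assuming a Linux system with the netlink and xfrm uapi headers installed. open_aevent_listener() is a hypothetical helper name, and joining the group may fail without sufficient privileges.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/xfrm.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270	/* fallback for older libc headers */
#endif

/*
 * Hypothetical helper: open a NETLINK_XFRM socket and join the
 * XFRMNLGRP_AEVENTS multicast group.
 * Returns the socket, or -1 with errno set on failure.
 */
static int open_aevent_listener(void)
{
	struct sockaddr_nl addr = { .nl_family = AF_NETLINK };
	int group = XFRMNLGRP_AEVENTS;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_XFRM);

	if (fd < 0)
		return -1;

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
		       &group, sizeof(group)) < 0) {
		int saved = errno;	/* keep the original error across close() */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

After a successful call, the returned socket can be read with recv() to obtain XFRM_MSG_NEWAE messages as they are multicast by the kernel.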
2) TLVS reflect the different parameters:
-----------------------------------------
a) byte value (XFRMA_LTIME_VAL)
This TLV carries the running/current counter for byte lifetime since
last event.
b) replay value (XFRMA_REPLAY_VAL)
This TLV carries the running/current counter for replay sequence since
last event.
c) replay threshold (XFRMA_REPLAY_THRESH)
This TLV carries the threshold being used by the kernel to trigger events
when the replay sequence is exceeded.
d) expiry timer (XFRMA_ETIMER_THRESH)
This is a timer value in milliseconds which is used as the nagle
value to rate limit the events.
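The sketch below shows one way user space might locate a TLV in the optional part of a message. It assumes the usual netlink attribute layout (a 16-bit length that includes the 4-byte header, a 16-bit type, and padding to a 4-byte boundary). The names tlv_hdr and tlv_find() are hypothetical, and the type numbers used by a caller would come from the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed attribute header layout: length (header included), then type. */
struct tlv_hdr {
	uint16_t len;
	uint16_t type;
};

#define TLV_ALIGN(n) (((size_t)(n) + 3u) & ~(size_t)3u)

/*
 * Hypothetical helper: find the first TLV of the given type.
 * Returns a pointer to its payload and stores the payload length,
 * or returns NULL if the type is absent or the buffer is malformed.
 */
static const void *tlv_find(const uint8_t *buf, size_t buflen,
			    uint16_t type, size_t *payload_len)
{
	size_t off = 0;

	while (buflen - off >= sizeof(struct tlv_hdr)) {
		struct tlv_hdr h;
		size_t step;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > buflen - off)
			return NULL;	/* truncated or corrupt TLV */
		if (h.type == type) {
			*payload_len = h.len - sizeof(h);
			return buf + off + sizeof(h);
		}
		step = TLV_ALIGN(h.len);
		if (step > buflen - off)
			break;		/* padding runs past the buffer */
		off += step;
	}
	return NULL;
}
```

The payload is copied out with memcpy() by the caller rather than dereferenced directly, which avoids alignment assumptions about the receive buffer.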
3) Default configurations for the parameters:
----------------------------------------------
By default these events should be turned off unless there is
at least one listener registered to listen to the multicast
group XFRMNLGRP_AEVENTS.
Programs installing SAs will need to specify the two thresholds, however,
in order to not change existing applications such as racoon
we also provide default threshold values for these different parameters
in case they are not specified.
The two sysctls/proc entries are:
a) /proc/sys/net/core/sysctl_xfrm_aevent_etime
used to provide default values for the XFRMA_ETIMER_THRESH in incremental
units of time of 100ms. The default is 10 (1 second).
b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth
used to provide default values for XFRMA_REPLAY_THRESH parameter
in incremental packet count. The default is two packets.
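The fallback described above can be sketched as follows. This is an illustration only: the struct and function names are hypothetical, a NULL pointer stands for "not specified by the program installing the SA", and the conversion assumes the timer threshold is held in milliseconds while the sysctl default is in units of 100ms, as the text states.

```c
#include <stddef.h>
#include <stdint.h>

/* System-wide defaults, in the units used by the two sysctls. */
struct ae_defaults {
	uint32_t etime_units;	/* units of 100ms; the text gives 10 as default */
	uint32_t rseqth;	/* packets; the text gives 2 as default */
};

/* Effective per-SA thresholds. */
struct ae_thresholds {
	uint32_t etimer_ms;
	uint32_t replay_thresh;
};

/*
 * Hypothetical helper: use the per-SA value when one was supplied,
 * otherwise fall back to the system-wide default.
 */
static struct ae_thresholds resolve_thresholds(const uint32_t *sa_etimer_ms,
					       const uint32_t *sa_replay_thresh,
					       struct ae_defaults d)
{
	struct ae_thresholds t;

	t.etimer_ms = sa_etimer_ms ? *sa_etimer_ms : d.etime_units * 100u;
	t.replay_thresh = sa_replay_thresh ? *sa_replay_thresh : d.rseqth;
	return t;
}
```

With the defaults quoted above (10 and 2), an SA installed without explicit thresholds ends up with a 1000ms timer threshold and a replay threshold of two packets.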
4) Message types
----------------
a) XFRM_MSG_GETAE issued by user-->kernel.
XFRM_MSG_GETAE does not carry any TLVs.
The response is a XFRM_MSG_NEWAE which is formatted based on what
XFRM_MSG_GETAE queried for.
The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved
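The rule for which TLVs appear in the response can be written down as a small function. This is a sketch under stated assumptions: the SK_ and TLV_ constants are local illustrative names (the TLV bit positions are arbitrary and are not the kernel attribute numbers), and newae_response_tlvs() is hypothetical.

```c
#include <stdint.h>

/* Local mirrors of the two request flags listed earlier in the text. */
#define SK_AE_RTHR 1u	/* replay threshold requested */
#define SK_AE_ETHR 8u	/* expiry timer threshold requested */

/* Illustrative bitmask of TLVs; bit positions are arbitrary. */
enum {
	TLV_LTIME_VAL     = 1u << 0,
	TLV_REPLAY_VAL    = 1u << 1,
	TLV_REPLAY_THRESH = 1u << 2,
	TLV_ETIMER_THRESH = 1u << 3,
};

/* Hypothetical helper: TLVs expected in the reply to a GETAE with these flags. */
static uint32_t newae_response_tlvs(uint32_t getae_flags)
{
	/* The two value TLVs are always present. */
	uint32_t tlvs = TLV_LTIME_VAL | TLV_REPLAY_VAL;

	if (getae_flags & SK_AE_RTHR)
		tlvs |= TLV_REPLAY_THRESH;
	if (getae_flags & SK_AE_ETHR)
		tlvs |= TLV_ETIMER_THRESH;
	return tlvs;
}
```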
b) XFRM_MSG_NEWAE is issued by either user space to configure
or kernel to announce events or respond to a XFRM_MSG_GETAE.
i) user --> kernel to configure a specific SA.
Any of the values or threshold parameters can be updated by passing the
appropriate TLV.
A response is issued back to the sender in user space to indicate success
or failure.
In the case of success, an XFRM_MSG_NEWAE event is also issued to any
listeners as described in iii).
ii) kernel->user direction as a response to XFRM_MSG_GETAE
The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
The threshold TLVs will be included if explicitly requested in
the XFRM_MSG_GETAE message.
iii) kernel->user to report as event if someone sets any values or
thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
In such a case XFRM_AE_CU flag is set to inform the user that
the change happened as a result of an update.
The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
iv) kernel->user to report event when replay threshold or a timeout
is exceeded.
In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
happened) is set to inform the user what happened.
Note the two flags are mutually exclusive.
The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
Exceptions to threshold settings
--------------------------------
If you have an SA that is getting hit by traffic in bursts such that
there is a period where the timer threshold expires with no packets
seen, then an odd behavior is seen as follows:
The first packet arrival after a timer expiry will trigger a timeout
aevent; i.e. we don't wait for a timeout period or a packet threshold
to be reached. This is done for simplicity and efficiency reasons.
-JHS
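The behavior described above can be modeled with a small user-space simulation. This is a simplified sketch, not the kernel code: it tracks a single sequence counter, and the names sa_sim, sim_packet() and sim_timer() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum sim_event { SIM_NONE, SIM_REPLAY, SIM_TIMEOUT };

struct sa_sim {
	uint32_t seq;		/* current replay sequence */
	uint32_t last_seq;	/* sequence at the last notification */
	uint32_t maxdiff;	/* replay threshold in packets */
	bool	 deferred;	/* a timer expired while no packets were seen */
};

/* Record that an event was sent and clear any pending deferral. */
static enum sim_event sim_emit(struct sa_sim *s, enum sim_event ev)
{
	s->last_seq = s->seq;
	s->deferred = false;
	return ev;
}

/* A packet arrived on the SA. */
static enum sim_event sim_packet(struct sa_sim *s)
{
	s->seq++;
	if (s->maxdiff && s->seq - s->last_seq < s->maxdiff)
		/* Below threshold: report only if a timeout was deferred. */
		return s->deferred ? sim_emit(s, SIM_TIMEOUT) : SIM_NONE;
	return sim_emit(s, SIM_REPLAY);
}

/* The expiry timer fired. */
static enum sim_event sim_timer(struct sa_sim *s)
{
	if (s->seq == s->last_seq) {
		/* Nothing changed since the last event: defer the report. */
		s->deferred = true;
		return SIM_NONE;
	}
	return sim_emit(s, SIM_TIMEOUT);
}
```

With a threshold of three packets, the third packet produces a replay event; a timer expiry during the following idle period is deferred, and the very next packet produces a timeout event even though neither threshold has been reached again.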
@@ -1815,14 +1815,14 @@ static int irda_usb_probe(struct usb_interface *intf,
 		self->needspatch = (ret < 0);
 		if (ret < 0) {
 			printk("patch_device failed\n");
-			goto err_out_4;
+			goto err_out_5;
 		}
 		/* replace IrDA class descriptor with what patched device is now reporting */
 		irda_desc = irda_usb_find_class_desc (self->usbintf);
 		if (irda_desc == NULL) {
 			ret = -ENODEV;
-			goto err_out_4;
+			goto err_out_5;
 		}
 		if (self->irda_desc)
 			kfree (self->irda_desc);
@@ -1832,6 +1832,8 @@ static int irda_usb_probe(struct usb_interface *intf,
 	return 0;
+err_out_5:
+	unregister_netdev(self->netdev);
 err_out_4:
 	kfree(self->speed_buff);
 err_out_3:
......
This diff is collapsed.
@@ -10,8 +10,6 @@
 extern struct neigh_table arp_tbl;
 extern void arp_init(void);
-extern int arp_rcv(struct sk_buff *skb, struct net_device *dev,
-		   struct packet_type *pt, struct net_device *orig_dev);
 extern int arp_find(unsigned char *haddr, struct sk_buff *skb);
 extern int arp_ioctl(unsigned int cmd, void __user *arg);
 extern void arp_send(int type, int ptype, u32 dest_ip,
......
@@ -143,6 +143,11 @@ struct xfrm_state
 	/* Replay detection state at the time we sent the last notification */
 	struct xfrm_replay_state preplay;
+	/* internal flag that only holds state for delayed aevent at the
+	 * moment
+	 */
+	u32 xflags;
 	/* Replay detection notification settings */
 	u32 replay_maxage;
 	u32 replay_maxdiff;
@@ -168,6 +173,9 @@ struct xfrm_state
 	void *data;
 };
+/* xflags - make enum if more show up */
+#define XFRM_TIME_DEFER 1
 enum {
 	XFRM_STATE_VOID,
 	XFRM_STATE_ACQ,
......
This diff is collapsed.
@@ -928,7 +928,8 @@ static void parp_redo(struct sk_buff *skb)
  * Receive an arp request from the device layer.
  */
-int arp_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev)
+static int arp_rcv(struct sk_buff *skb, struct net_device *dev,
+		   struct packet_type *pt, struct net_device *orig_dev)
 {
 	struct arphdr *arp;
@@ -1417,7 +1418,6 @@ static int __init arp_proc_init(void)
 EXPORT_SYMBOL(arp_broken_ops);
 EXPORT_SYMBOL(arp_find);
-EXPORT_SYMBOL(arp_rcv);
 EXPORT_SYMBOL(arp_create);
 EXPORT_SYMBOL(arp_xmit);
 EXPORT_SYMBOL(arp_send);
......
@@ -1556,7 +1556,6 @@ void __init devinet_init(void)
 #endif
 }
-EXPORT_SYMBOL(devinet_ioctl);
 EXPORT_SYMBOL(in_dev_finish_destroy);
 EXPORT_SYMBOL(inet_select_addr);
 EXPORT_SYMBOL(inetdev_by_index);
......
@@ -666,4 +666,3 @@ void __init ip_fib_init(void)
 }
 EXPORT_SYMBOL(inet_addr_type);
-EXPORT_SYMBOL(ip_rt_ioctl);
@@ -43,8 +43,6 @@ struct inet_bind_bucket *inet_bind_bucket_create(kmem_cache_t *cachep,
 	return tb;
 }
-EXPORT_SYMBOL(inet_bind_bucket_create);
 /*
  * Caller must hold hashbucket lock for this tb with local BH disabled
  */
@@ -64,8 +62,6 @@ void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb,
 	inet_csk(sk)->icsk_bind_hash = tb;
 }
-EXPORT_SYMBOL(inet_bind_hash);
 /*
  * Get rid of any references to a local port held by the given sock.
  */
......
@@ -904,7 +904,7 @@ int ip_append_data(struct sock *sk,
 			 * because we have no idea what fragment will be
 			 * the last.
 			 */
-			if (datalen == length)
+			if (datalen == length + fraggap)
 				alloclen += rt->u.dst.trailer_len;
 			if (transhdrlen) {
......
@@ -4559,7 +4559,6 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 EXPORT_SYMBOL(sysctl_tcp_ecn);
 EXPORT_SYMBOL(sysctl_tcp_reordering);
-EXPORT_SYMBOL(sysctl_tcp_abc);
 EXPORT_SYMBOL(tcp_parse_options);
 EXPORT_SYMBOL(tcp_rcv_established);
 EXPORT_SYMBOL(tcp_rcv_state_process);
......
@@ -1859,5 +1859,4 @@ EXPORT_SYMBOL(tcp_proc_unregister);
 #endif
 EXPORT_SYMBOL(sysctl_local_port_range);
 EXPORT_SYMBOL(sysctl_tcp_low_latency);
-EXPORT_SYMBOL(sysctl_tcp_tw_reuse);
@@ -59,9 +59,6 @@ int sysctl_tcp_tso_win_divisor = 3;
 int sysctl_tcp_mtu_probing = 0;
 int sysctl_tcp_base_mss = 512;
-EXPORT_SYMBOL(sysctl_tcp_mtu_probing);
-EXPORT_SYMBOL(sysctl_tcp_base_mss);
 static void update_send_head(struct sock *sk, struct tcp_sock *tp,
 			     struct sk_buff *skb)
 {
......
@@ -805,16 +805,22 @@ void xfrm_replay_notify(struct xfrm_state *x, int event)
 	case XFRM_REPLAY_UPDATE:
 		if (x->replay_maxdiff &&
 		    (x->replay.seq - x->preplay.seq < x->replay_maxdiff) &&
-		    (x->replay.oseq - x->preplay.oseq < x->replay_maxdiff))
-			return;
+		    (x->replay.oseq - x->preplay.oseq < x->replay_maxdiff)) {
+			if (x->xflags & XFRM_TIME_DEFER)
+				event = XFRM_REPLAY_TIMEOUT;
+			else
+				return;
+		}
 		break;
 	case XFRM_REPLAY_TIMEOUT:
 		if ((x->replay.seq == x->preplay.seq) &&
 		    (x->replay.bitmap == x->preplay.bitmap) &&
-		    (x->replay.oseq == x->preplay.oseq))
-			return;
+		    (x->replay.oseq == x->preplay.oseq)) {
+			x->xflags |= XFRM_TIME_DEFER;
+			return;
+		}
 		break;
 	}
@@ -825,8 +831,10 @@ void xfrm_replay_notify(struct xfrm_state *x, int event)
 	km_state_notify(x, &c);
 	if (x->replay_maxage &&
-	    !mod_timer(&x->rtimer, jiffies + x->replay_maxage))
+	    !mod_timer(&x->rtimer, jiffies + x->replay_maxage)) {
 		xfrm_state_hold(x);
+		x->xflags &= ~XFRM_TIME_DEFER;
+	}
 }
 EXPORT_SYMBOL(xfrm_replay_notify);
@@ -836,10 +844,15 @@ static void xfrm_replay_timer_handler(unsigned long data)
 	spin_lock(&x->lock);
-	if (xfrm_aevent_is_on() && x->km.state == XFRM_STATE_VALID)
-		xfrm_replay_notify(x, XFRM_REPLAY_TIMEOUT);
+	if (x->km.state == XFRM_STATE_VALID) {
+		if (xfrm_aevent_is_on())
+			xfrm_replay_notify(x, XFRM_REPLAY_TIMEOUT);
+		else
+			x->xflags |= XFRM_TIME_DEFER;
+	}
 	spin_unlock(&x->lock);
+	xfrm_state_put(x);
 }
 int xfrm_replay_check(struct xfrm_state *x, u32 seq)
......