update documentation on mirred and IFB

About two more or so to complete these.. cheers, jamal Clean up some documentation on mirred and IFB

update documentation on mirred and IFB
About two more or so to complete these.. cheers, jamal Clean up some documentation on mirred and IFB
f649f592 · jamal · Stephen Hemminger · 1d35a127 · 1d35a127 · f649f592
Commit f649f592 authored Jul 18, 2006 by jamal Committed by Stephen Hemminger Aug 04, 2006
Hide whitespace changes
Inline Side-by-side

Showing with 90 additions and 201 deletions

doc/actions/dummy-README doc/actions/dummy-README +0 -155

doc/actions/ifb-README doc/actions/ifb-README +9 -39

doc/actions/mirred-usage doc/actions/mirred-usage +81 -7

No files found.
--- a/doc/actions/dummy-README
+++ b/doc/actions/dummy-README
-
-Advantage over current IMQ; cleaner in particular in in SMP;
-with a _lot_ less code.
-Old Dummy device functionality is preserved while new one only
-kicks in if you use actions.
-
-IMQ USES
--------
-As far as i know the reasons listed below is why people use IMQ. 
-It would be nice to know of anything else that i missed.
-
-1) qdiscs/policies that are per device as opposed to system wide.
-IMQ allows for sharing.
-
-2) Allows for queueing incoming traffic for shaping instead of
-dropping. I am not aware of any study that shows policing is 
-worse than shaping in achieving the end goal of rate control.
-I would be interested if anyone is experimenting.
-
-3) Very interesting use: if you are serving p2p you may wanna give 
-preference to your own localy originated traffic (when responses come back)
-vs someone using your system to do bittorent. So QoSing based on state
-comes in as the solution. What people did to achive this was stick
-the IMQ somewhere prelocal hook.
-I think this is a pretty neat feature to have in Linux in general.
-(i.e not just for IMQ).
-But i wont go back to putting netfilter hooks in the device to satisfy
-this.  I also dont think its worth it hacking dummy some more to be 
-aware of say L3 info and play ip rule tricks to achieve this.
--> Instead the plan is to have a contrack related action. This action will
-selectively either query/create contrack state on incoming packets. 
-Packets could then be redirected to dummy based on what happens -> eg 
-on incoming packets; if we find they are of known state we could send to 
-a different queue than one which didnt have existing state. This
-all however is dependent on whatever rules the admin enters.
-
-At the moment this function does not exist yet. I have decided instead
-of sitting on the patch to release it and then if theres pressure i will
-add this feature.
-
-What you can do with dummy currently with actions
--------------------------------------------------
-
-Lets say you are policing packets from alias 192.168.200.200/32
-you dont want those to exceed 100kbps going out.
-
-tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
-match ip src 192.168.200.200/32 flowid 1:2 \
-action police rate 100kbit burst 90k drop
-
-If you run tcpdump on eth0 you will see all packets going out
-with src 192.168.200.200/32 dropped or not
-Extend the rule a little to see only the ones that made it out:
-
-tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
-match ip src 192.168.200.200/32 flowid 1:2 \
-action police rate 10kbit burst 90k drop \
-action mirred egress mirror dev dummy0 
-
-Now fire tcpdump on dummy0 to see only those packets ..
-tcpdump -n -i dummy0 -x -e -t 
-
-Essentially a good debugging/logging interface.
-
-If you replace mirror with redirect, those packets will be
-blackholed and will never make it out. This redirect behavior
-changes with new patch (but not the mirror). 
-
-What you can do with the patch to provide functionality
-that most people use IMQ for below:
-
--------
-export TC="/sbin/tc"
-
-$TC qdisc add dev dummy0 root handle 1: prio 
-$TC qdisc add dev dummy0 parent 1:1 handle 10: sfq
-$TC qdisc add dev dummy0 parent 1:2 handle 20: tbf rate 20kbit buffer 1600 limit 3000
-$TC qdisc add dev dummy0 parent 1:3 handle 30: sfq                                
-$TC filter add dev dummy0 protocol ip pref 1 parent 1: handle 1 fw classid 1:1
-$TC filter add dev dummy0 protocol ip pref 2 parent 1: handle 2 fw classid 1:2
-
-ifconfig dummy0 up
-
-$TC qdisc add dev eth0 ingress
-
-# redirect all IP packets arriving in eth0 to dummy0 
-# use mark 1 --> puts them onto class 1:1
-$TC filter add dev eth0 parent ffff: protocol ip prio 10 u32 \
-match u32 0 0 flowid 1:1 \
-action ipt -j MARK --set-mark 1 \
-action mirred egress redirect dev dummy0
-
--------
-
-
-Run A Little test:
-
-from another machine ping so that you have packets going into the box:
-----
-[root@jzny action-tests]# ping 10.22
-PING 10.22 (10.0.0.22): 56 data bytes
-64 bytes from 10.0.0.22: icmp_seq=0 ttl=64 time=2.8 ms
-64 bytes from 10.0.0.22: icmp_seq=1 ttl=64 time=0.6 ms
-64 bytes from 10.0.0.22: icmp_seq=2 ttl=64 time=0.6 ms
-
--- 10.22 ping statistics ---
-3 packets transmitted, 3 packets received, 0% packet loss
-round-trip min/avg/max = 0.6/1.3/2.8 ms
-[root@jzny action-tests]# 
-----
-Now look at some stats:
-
---
-[root@jmandrake]:~# $TC -s filter show parent ffff: dev eth0
-filter protocol ip pref 10 u32 
-filter protocol ip pref 10 u32 fh 800: ht divisor 1 
-filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 
-  match 00000000/00000000 at 0
-        action order 1: tablename: mangle  hook: NF_IP_PRE_ROUTING 
-        target MARK set 0x1  
-        index 1 ref 1 bind 1 installed 4195sec  used 27sec 
-         Sent 252 bytes 3 pkts (dropped 0, overlimits 0) 
-
-        action order 2: mirred (Egress Redirect to device dummy0) stolen
-        index 1 ref 1 bind 1 installed 165 sec used 27 sec
-         Sent 252 bytes 3 pkts (dropped 0, overlimits 0) 
-
-[root@jmandrake]:~# $TC -s qdisc
-qdisc sfq 30: dev dummy0 limit 128p quantum 1514b 
- Sent 0 bytes 0 pkts (dropped 0, overlimits 0) 
-qdisc tbf 20: dev dummy0 rate 20Kbit burst 1575b lat 2147.5s 
- Sent 210 bytes 3 pkts (dropped 0, overlimits 0) 
-qdisc sfq 10: dev dummy0 limit 128p quantum 1514b 
- Sent 294 bytes 3 pkts (dropped 0, overlimits 0) 
-qdisc prio 1: dev dummy0 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
- Sent 504 bytes 6 pkts (dropped 0, overlimits 0) 
-qdisc ingress ffff: dev eth0 ---------------- 
- Sent 308 bytes 5 pkts (dropped 0, overlimits 0) 
-
-[root@jmandrake]:~# ifconfig dummy0
-dummy0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
-          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
-          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
-          RX packets:6 errors:0 dropped:3 overruns:0 frame:0
-          TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
-          collisions:0 txqueuelen:32 
-          RX bytes:504 (504.0 b)  TX bytes:252 (252.0 b)
-----
-
-Dummy continues to behave like it always did.
-You send it any packet not originating from the actions it will drop them.
-[In this case the three dropped packets were ipv6 ndisc].
-
-cheers,
-jamal
--- a/doc/actions/ifb-README
+++ b/doc/actions/ifb-README

+IFB is intended to replace IMQ.
 Advantage over current IMQ; cleaner in particular in in SMP;
 with a _lot_ less code.
-Old Dummy device functionality is preserved while new one only
-kicks in if you use actions.

-IMQ USES
--------
+Known IMQ/IFB USES
+------------------
+
 As far as i know the reasons listed below is why people use IMQ. 
 It would be nice to know of anything else that i missed.

 1) qdiscs/policies that are per device as opposed to system wide.
-IMQ allows for sharing.
+IFB allows for sharing.

 2) Allows for queueing incoming traffic for shaping instead of
 dropping. I am not aware of any study that shows policing is 
@@ -34,40 +34,11 @@ on incoming packets; if we find they are of known state we could send to
 a different queue than one which didnt have existing state. This
 all however is dependent on whatever rules the admin enters.

-At the moment this function does not exist yet. I have decided instead
-of sitting on the patch to release it and then if theres pressure i will
-add this feature.
-
-What you can do with ifb currently with actions
--------------------------------------------------
-
-Lets say you are policing packets from alias 192.168.200.200/32
-you dont want those to exceed 100kbps going out.
-
-tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
-match ip src 192.168.200.200/32 flowid 1:2 \
-action police rate 100kbit burst 90k drop
-
-If you run tcpdump on eth0 you will see all packets going out
-with src 192.168.200.200/32 dropped or not
-Extend the rule a little to see only the ones that made it out:
-
-tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
-match ip src 192.168.200.200/32 flowid 1:2 \
-action police rate 10kbit burst 90k drop \
-action mirred egress mirror dev ifb0 
-
-Now fire tcpdump on ifb0 to see only those packets ..
-tcpdump -n -i ifb0 -x -e -t 
-
-Essentially a good debugging/logging interface.
-
-If you replace mirror with redirect, those packets will be
-blackholed and will never make it out. This redirect behavior
-changes with new patch (but not the mirror). 
+At the moment this 3rd function does not exist yet. I have decided that
+instead of sitting on the patch for another year, to release it and then 
+if theres pressure i will add this feature.

-What you can do with the patch to provide functionality
-that most people use IMQ for below:
+An example, to provide functionality that most people use IMQ for below:

 --------
 export TC="/sbin/tc"
@@ -147,7 +118,6 @@ ifb0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          RX bytes:504 (504.0 b)  TX bytes:252 (252.0 b)
 -----

-Dummy continues to behave like it always did.
 You send it any packet not originating from the actions it will drop them.
 [In this case the three dropped packets were ipv6 ndisc].


--- a/doc/actions/mirred-usage
+++ b/doc/actions/mirred-usage
@@ -12,12 +12,59 @@ ACTION := <mirror | redirect>
 INDEX is the specific policy instance id
 DEVICENAME is the devicename

+Direction Ingress is not supported at the moment. It will be in the
+future as well as mirror/redirecting to a socket.

 Mirroring essentially takes a copy of the packet whereas redirecting
 steals the packet and redirects to specified destination.

+What NOT to do if you dont want your machine to crash:
+------------------------------------------------------
+
+Do not create loops! 
+Loops are not hard to create in the egress qdiscs.
+
+Here are simple rules to follow if you dont want to get
+hurt:
+A) Do not have the same packet go to same netdevice twice
+in a single graph of policies. Your machine will just hang!
+This is design intent _not a bug_ to teach you some lessons. 
+
+In the future if there are easy ways to do this in the kernel
+without affecting other packets not interested in this feature
+I will add them. At the moment that is not clear.
+
+Some examples of bad things to do:
+1) redirecting eth0 to eth0
+2) eth0->eth1-> eth0
+3) eth0->lo-> eth1-> eth0
+
+B) Do not redirect from one IFB device to another.
+Remember that IFB is a very specialized case of packet redirecting
+device. Instead of redirecting it puts packets at the exact spot
+on the stack it found them from.
+This bad policy will actually not crash your machine but your 
+packets will all be dropped (this is much simpler to detect
+and resolve and is only affecting users of ifb as opposed to the
+whole stack).
+
+In the case of A) the problem has to do with a recursive contention
+for the devices queue lock and in the second case for the transmit lock.
+
 Some examples:
-Host A is hooked  up to us on eth0
+------------
+
+1) Mirror all packets arriving on eth0 to be sent out on eth1.
+You may have a sniffer or some accounting box hooked up on eth1.
+ 
+tc qdisc add dev lo eth0
+tc filter add dev eth0 parent ffff: protocol ip prio 10 u32 \
+match u32 0 0 flowid 1:2 action mirred egress mirror dev eth1
+
+If you replace "mirror" with "redirect" then not a copy but rather
+the original packet is sent to eth1.
+
+2) Host A is hooked  up to us on eth0

 tc qdisc add dev lo ingress
 # redirect all packets arriving on ingress of lo to eth0
@@ -28,7 +75,7 @@ On host A start a tcpdump on interface connecting to us.

 on our host ping -c 2 127.0.0.1

-Ping would fail sinc all packets are heading out eth0
+Ping would fail since all packets are heading out eth0
 tcpudmp on host A would show them

 if you substitute the redirect with mirror above as in:
@@ -38,7 +85,7 @@ match u32 0 0 flowid 1:2 action mirred egress mirror dev eth0
 Then you should see the packets on both host A and the local
 stack (i.e ping would work).

-Even more funky example:
+3) Even more funky example:

 #
 #allow 1 out 10 packets to randomly make it to the 
@@ -49,11 +96,10 @@ match u32 0 0 flowid 1:2 \
 action drop random determ ok 10\
 action mirred egress mirror dev eth0

------
-Example 2:
+4)
 # for packets coming from 10.0.0.9:
-#Redirect packets on egress (to ISP A) if you exceed a certain rate
-# to eth1 (to ISP B) if you exceed a certain rate
+#Redirect packets on egress, if exceeding a 100Kbps rate,
+# to eth1 
 #

 tc qdisc add dev eth0 handle 1:0 root prio
@@ -69,3 +115,31 @@ A more interesting example is when you mirror flows to a dummy device
 so you could tcpdump them (dummy by defaults drops all packets it sees).
 This is a very useful debug feature.

+Lets say you are policing packets from alias 192.168.200.200/32
+you dont want those to exceed 100kbps going out.
+
+tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
+match ip src 192.168.200.200/32 flowid 1:2 \
+action police rate 100kbit burst 90k drop
+
+If you run tcpdump on eth0 you will see all packets going out
+with src 192.168.200.200/32 dropped or not
+Extend the rule a little to see only the ones that made it out:
+
+tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
+match ip src 192.168.200.200/32 flowid 1:2 \
+action police rate 10kbit burst 90k drop \
+action mirred egress mirror dev dummy0
+
+Now fire tcpdump on dummy0 to see only those packets ..
+tcpdump -n -i dummy0 -x -e -t
+
+Essentially a good debugging/logging interface (sort of like
+BSDs speacialized log device does without needing one).
+
+If you replace mirror with redirect, those packets will be
+blackholed and will never make it out. This redirect behavior
+changes with new patch (but not the mirror).
+
+cheers,
+jamal