1. 16 May, 2016 6 commits
    • ingress, clsact: don't add TCA_OPTIONS to nl msg · a2de651e
      Daniel Borkmann authored
      The ingress and clsact qdiscs ignore TCA_OPTIONS, since they are
      parameterless. In tc, we nevertheless add an empty
      addattr_l(... TCA_OPTIONS, NULL, 0) to the netlink message. As a
      side effect, when someone runs 'tc qdisc replace' and such a
      qdisc is already present, tc fails with EINVAL.
      
      The reason is that the kernel invokes qdisc_change() when the
      requested qdisc is already present. When TCA_OPTIONS is passed
      to modify parameters, the kernel checks whether the qdisc
      implements the .change() callback, and if it does not (as in
      both cases here) it returns an error. Rather than adding an
      empty stub to the kernel that ignores TCA_OPTIONS again, just
      don't add TCA_OPTIONS to the netlink message in the first place.
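
      The check described above can be modelled in a few lines of C. This
      is a simplified sketch, not the kernel source: the struct, the
      function model_qdisc_change() and the model_clsact table are
      hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-qdisc ops table. */
struct model_qdisc_ops {
    const char *id;
    /* NULL when the qdisc has no tunable parameters (ingress, clsact). */
    int (*change)(const void *opts, size_t len);
};

/*
 * Models the behaviour described in the commit message: a replace
 * request that hits an existing qdisc is handled as a change. If the
 * message carries TCA_OPTIONS (even an empty attribute) and the qdisc
 * has no .change() callback, the request fails with -EINVAL.
 */
int model_qdisc_change(const struct model_qdisc_ops *ops,
                       bool has_tca_options,
                       const void *opts, size_t len)
{
    assert(ops != NULL);
    if (!has_tca_options)
        return 0;           /* nothing to modify, request succeeds */
    if (ops->change == NULL)
        return -EINVAL;     /* options present, no way to apply them */
    return ops->change(opts, len);
}

/* A parameterless qdisc: no .change() callback. */
const struct model_qdisc_ops model_clsact = { "clsact", NULL };
```

      In this model, calling model_qdisc_change(&model_clsact, true, NULL, 0)
      yields -EINVAL, which corresponds to the failing second replace,
      while omitting the attribute (has_tca_options == false) succeeds.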
      
      Before:
      
        # tc qdisc replace dev foo clsact    # first try
        # tc qdisc replace dev foo clsact    # second one
        RTNETLINK answers: Invalid argument
      
      After:
      
        # tc qdisc replace dev foo clsact
        # tc qdisc replace dev foo clsact
        # tc qdisc replace dev foo clsact
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'master' into net-next · 866f6d77
      Stephen Hemminger authored
    • tc simple action update and breakage · fdf1bdd0
      Jamal Hadi Salim authored
      Brings it closer to more serious actions (adding branching
      and allowing for late binding).

      Unfortunately, this breaks the old syntax of the simple action.
      But because simple is a pedagogical example unlikely to be used
      in production environments (i.e. its role is to serve as an
      example of how to write actions), this is OK.

      The new syntax for simple has a new keyword, "sdata". Example usage:
      
      sudo tc actions add action simple sdata "foobar" index 1
      or
      tc filter add dev $DEV parent ffff: protocol ip prio 1 u32 \
      match ip dst 17.0.0.1/32 flowid 1:10 action simple sdata "foobar"
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
    • tc: don't ignore ok as an action branch · 43726b75
      Jamal Hadi Salim authored
      This is what used to happen before:
      
      tc filter add dev tap1 parent ffff: protocol 0xfefe prio 10 \
           u32 match u32 0 0 flowid 1:16 \
           action ife decode allow mark ok
      
      tc -s filter ls dev tap1 parent ffff:
      filter protocol [65278] pref 10 u32
      filter protocol [65278] pref 10 u32 fh 800: ht divisor 1
      filter protocol [65278] pref 10 u32 fh 800::800 order 2048 key ht 800
      bkt 0 flowid 1:16
        match 00000000/00000000 at 0
              action order 1: ife decode action pipe
               index 2 ref 1 bind 1 installed 4 sec used 4 sec
               type: 0x0
               Metadata: allow mark
              Action statistics:
              Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0
      
              action order 2: gact action pass
               random type none pass val 0
               index 1 ref 1 bind 1 installed 4 sec used 4 sec
              Action statistics:
              Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0
      
      Note the extra action added at the end.
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
    • tc: introduce IFE action · d3e51122
      Jamal Hadi Salim authored
      This action allows a sending side to encapsulate arbitrary metadata,
      which is decapsulated by the receiving end.
      The sender runs in encode mode and the receiver in decode mode.
      Both sender and receiver must specify the same ethertype.
      At some point we hope to have a registered ethertype, and we'll
      then provide a default so the user doesn't have to specify it.
      For now we require the user to specify it.
      
      Described in netdev01 paper:
         "Distributing Linux Traffic Control Classifier-Action Subsystem"
          Authors: Jamal Hadi Salim and Damascene M. Joachimpillai
      
      Also refer to IETF draft-ietf-forces-interfelfb-04.txt
      
      Let's show example usage where we encode icmp from a sender towards
      a receiver with an skbmark of 17; both sender and receiver use
      an ethertype of 0xdead to interop.

      YYYY: Let's start with the receiver-side policy config:
      xxx: add an ingress qdisc
      sudo tc qdisc add dev $ETH ingress
      
      xxx: any packets with ethertype 0xdead will be subjected to ife decoding
      xxx: we then restart the classification so we can match on icmp at prio 3
      sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \
      u32 match u32 0 0 flowid 1:1 \
      action ife decode reclassify
      
      xxx: on restarting the classification from above if it was an icmp
      xxx: packet, then match it here and continue to the next rule at prio 4
      xxx: which will match based on skb mark of 17
      sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \
      u32 match ip protocol 1 0xff flowid 1:1 \
      action continue
      
      xxx: match on skbmark of 0x11 (decimal 17) and accept
      sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \
      handle 0x11 fw flowid 1:1 \
      action ok
      
      xxx: Let's show the decoding policy
      sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead
      xxx:
      filter pref 2 u32
      filter pref 2 u32 fh 800: ht divisor 1
      filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 0 success 0)
        match 00000000/00000000 at 0 (success 0 )
      	action order 1: ife decode action reclassify type 0x0
      	 allow mark allow prio
      	 index 11 ref 1 bind 1 installed 45 sec used 45 sec
      	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      xxx:
      Observe that the listing above shows all the metadata it can decode.
      Typically these submodules will already be compiled into a monolithic
      kernel or loaded as modules.
      
      YYYY: Let's show the sender side now.
      xxx: Add an egress qdisc on the sender netdev
      sudo tc qdisc add dev $ETH root handle 1: prio
      xxx:
      xxx: Match all icmp packets to 192.168.122.237/24, then
      xxx: tag the packet with skb mark of decimal 17, then
      xxx: Encode it with:
      xxx:    ethertype 0xdead
      xxx:    add skb->mark to whitelist of metadata to send
      xxx:    rewrite target dst MAC address to 02:15:15:15:15:15
      xxx:
      sudo $TC filter add dev $ETH parent 1: protocol ip prio 10  u32 \
      match ip dst 192.168.122.237/24 \
      match ip protocol 1 0xff \
      flowid 1:2 \
      action skbedit mark 17 \
      action ife encode \
      type 0xDEAD \
      allow mark \
      dst 02:15:15:15:15:15
      
      xxx: Let's show the encoding policy
      filter pref 10 u32
      filter pref 10 u32 fh 800: ht divisor 1
      filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2  (rule hit 118 success 0)
        match c0a87a00/ffffff00 at 16 (success 0 )
        match 00010000/00ff0000 at 8 (success 0 )
      	action order 1:  skbedit mark 17
      	 index 11 ref 1 bind 1 installed 3 sec used 3 sec
      	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      	action order 2: ife encode action pipe type 0xDEAD
      	 allow mark dst 02:15:15:15:15:15
      	 index 12 ref 1 bind 1 installed 3 sec used 3 sec
      	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      xxx:
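
      The two u32 keys printed in the listing above ("match c0a87a00/ffffff00
      at 16" and "match 00010000/00ff0000 at 8") follow from the IPv4 header
      layout: the destination address is the 32-bit word at byte offset 16,
      and the protocol field is byte 9, i.e. the second byte of the word at
      offset 8. The sketch below reproduces that arithmetic. The struct and
      function names are hypothetical and are not part of tc.

```c
#include <assert.h>
#include <stdint.h>

/* One u32 selector key as displayed by tc: value/mask at a byte offset. */
struct u32_key {
    uint32_t value;
    uint32_t mask;
    int offset;
};

/* Key for "match ip dst a.b.c.d/prefix_len" (IPv4 dst is at offset 16). */
struct u32_key ip_dst_key(uint8_t a, uint8_t b, uint8_t c, uint8_t d,
                          unsigned prefix_len)
{
    assert(prefix_len <= 32);
    uint32_t addr = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
                    ((uint32_t)c << 8) | (uint32_t)d;
    /* Avoid an undefined 32-bit shift when the prefix length is 0. */
    uint32_t mask = prefix_len ? 0xffffffffu << (32 - prefix_len) : 0;
    struct u32_key key = { addr & mask, mask, 16 };
    return key;
}

/* Key for "match ip protocol proto mask" (byte 9, inside the word at 8). */
struct u32_key ip_protocol_key(uint8_t proto, uint8_t mask)
{
    struct u32_key key = {
        (uint32_t)(proto & mask) << 16,
        (uint32_t)mask << 16,
        8
    };
    return key;
}
```

      With these helpers, ip_dst_key(192, 168, 122, 237, 24) gives the
      value 0xc0a87a00 under mask 0xffffff00, and ip_protocol_key(1, 0xff)
      gives 0x00010000 under mask 0x00ff0000, matching the listing.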
      
      Now test by sending a ping from the sender to the destination.
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
    • add tc_ife.h · 29b79689
      Stephen Hemminger authored
  2. 13 May, 2016 11 commits
  3. 06 May, 2016 2 commits
    • ip link gre: print only relevant info in external mode · 7c337e2c
      Jiri Benc authored
      Display only attributes that are relevant when a GRE interface is in
      'external' mode instead of the default values (which are ignored by the
      kernel even if passed back).
      
      Fixes: 926b39e1 ("gre: add support for collect metadata flag")
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
    • ip link gre: create interfaces in external mode correctly · df217d5d
      Jiri Benc authored
      For GRE interfaces in 'external' mode, the kernel ignores all manual
      settings such as the remote IP address or TTL. However, for some of
      those attributes, the kernel checks their value and does not allow
      them to be zero (even though they're ignored later).
      
      Currently, 'ip link' always includes all attributes in the netlink message.
      This leads to a problem when creating interfaces in 'external' mode. For
      example, this command does not work:
      
      ip link add gre1 type gretap external
      
      and needs a bogus remote IP address to be specified, as the kernel
      requires the remote IP address to be either absent or non-zero.

      Ignore the parameters that do not make sense in 'external' mode.
      Unfortunately, we cannot error out, as there may be existing deployments
      that worked around the bug by specifying bogus values.
      
      Fixes: 926b39e1 ("gre: add support for collect metadata flag")
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
  4. 03 May, 2016 1 commit
    • tc: add bash-completion function · 27d44f3a
      Quentin Monnet authored
      Add function for command completion for tc in bash, and update Makefile
      to install it under /usr/share/bash-completion/completions/.
      
      Inside iproute2 repository, the completion code is in a new
      `bash-completion` toplevel directory.
      
      v2: Remove `if` statement in Makefile: do not try to install in
          /etc/bash_completion.d/ if /usr/share/bash-completion/completions/
          is not found; instead, the user can override the installation path
          with the specific environment variable.
      Signed-off-by: Quentin Monnet <quentin.monnet@6wind.com>
  5. 25 Apr, 2016 1 commit
  6. 22 Apr, 2016 2 commits
  7. 19 Apr, 2016 17 commits