1. 13 Apr, 2015 9 commits
    • tc: add support for connmark action · b8d5c9a7
      Felix Fietkau authored
      Add support for the netfilter connmark action.

      Typical usage:
      ...let's tag outgoing icmp with mark 0x10...
      iptables -t mangle -A PREROUTING -p icmp -j CONNMARK --set-mark 0x10
      ...add on ingress of $ETH an extractor for connmark...
      tc filter add dev $ETH parent ffff: prio 4 protocol ip \
      u32 match ip protocol 1 0xff \
      flowid 1:1 \
      action connmark continue
      ...if the connmark was 0x11, we police to a ridiculously low rate of 10Kbps
      tc filter add dev $ETH parent ffff: prio 5 protocol ip \
      handle 0x11 fw flowid 1:1 \
      action police rate 10kbit burst 10k
      
      Other ways to use connmark are to supply the zone, index and
      branching choice. Refer to the help output.
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      b8d5c9a7
    • update kernel headers and add tc_connmark.h · 94f66538
      Stephen Hemminger authored
      Needed for later tc action patches
      94f66538
    • iproute2: unify naming for entries offloaded to hardware · aa05b988
      Andy Gospodarek authored
      The kernel now has the capability to offload FDB and FIB entries to hardware.
      It is important to let users know if table entries are also offloaded to
      hardware.  Currently offloaded FDB entries are indicated by the existence of
      the flag 'external' on the entry as of the following commit:
      
      commit 28467b7f
      Author: Scott Feldman <sfeldma@gmail.com>
      Date:   Thu Dec 4 09:57:15 2014 +0100
      
          bridge/fdb: add flag/indication for FDB entry synced from offload device
      
      When the patch to add support for indicating that FIB entries were also
      offloaded was posted to netdev by Scott Feldman, it became clear that
      'external' would not be an ideal name for routes.  There could definitely be
      confusion about what this might mean since many routes are to external
      networks -- a collision/confusion that did not happen with FDB.
      
      Scott Feldman asked me to check with others and build consensus around a name.
      After speaking with several people about this I am proposing we refer to both
      FDB and FIB entries that are currently backed by hardware (based on the work
      done in rocker) with the flag 'offload' appended to the end of the entry.
      
      Some people liked the string 'external,' others liked 'hardware,' but the point
      is to communicate that these routes are available to something that will
      offload the forwarding normally done by the kernel.  Since the term 'offload'
      is used so frequently it seems appropriate to use the same language in
      ip/bridge output.
      
      The term 'offload' also seems to resonate with many of the people who have
      responded on Scott's original thread or to those who I reached out to directly
      and did respond to my query, so it seems we have reached consensus that it
      should be the term used going forward.
      
      v2: rebased against net-next branch
      Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Jiri Pirko <jiri@resnulli.us>
      CC: John W. Linville <linville@tuxdriver.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      CC: Scott Feldman <sfeldma@gmail.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      aa05b988
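As a rough illustration of how a next-hop flag word might be turned into the trailing labels seen in ip output, the sketch below maps flag bits to names. The bit values are assumptions modelled on the kernel's RTNH_F_* constants (the commit message above only settles on the printed word 'offload'), and `format_flags` is a hypothetical helper, not iproute2 code.

```python
# Assumed bit values, modelled on the kernel's RTNH_F_* next-hop flags.
RTNH_F_DEAD = 1
RTNH_F_PERVASIVE = 2
RTNH_F_ONLINK = 4
RTNH_F_OFFLOAD = 8  # assumed to be the bit that is printed as "offload"

# Order here decides the order in which labels are printed.
_FLAG_NAMES = [
    (RTNH_F_DEAD, "dead"),
    (RTNH_F_PERVASIVE, "pervasive"),
    (RTNH_F_ONLINK, "onlink"),
    (RTNH_F_OFFLOAD, "offload"),
]


def format_flags(flags: int) -> str:
    """Return the space-separated labels for every flag bit that is set."""
    return " ".join(name for bit, name in _FLAG_NAMES if flags & bit)
```

A route whose flag word has only the offload bit set would then be printed with a single trailing `offload`, which matches the naming proposed in the commit message.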
    • Merge branch 'master' into net-next · 93531fac
      Stephen Hemminger authored
      93531fac
    • fix whitespace · 672acc72
      Stephen Hemminger authored
      672acc72
    • v4.0.0 · aed6d85d
      Stephen Hemminger authored
      aed6d85d
    • ipnetns: add a runtime check for RTM_GETNSID support · 4c7d9a58
      Nicolas Dichtel authored
      The goal of this patch is to test at runtime whether the RTM_GETNSID command
      is supported by the kernel.
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      4c7d9a58
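One way such a runtime probe could decide the answer is to send a single RTM_GETNSID request and inspect the reply. The sketch below covers only the classification step. It assumes the standard netlink header layout, and it assumes that an unsupporting kernel answers with an error of EOPNOTSUPP or EINVAL; `getnsid_supported` is a hypothetical name, not a function from the patch.

```python
import errno
import struct

NLMSG_ERROR = 2                      # netlink message type used for error replies
NLMSGHDR = struct.Struct("=IHHII")   # nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid


def getnsid_supported(reply: bytes) -> bool:
    """Classify the first netlink message in `reply`.

    Assumption: a kernel without RTM_GETNSID answers with NLMSG_ERROR
    carrying -EOPNOTSUPP or -EINVAL; anything else counts as supported.
    """
    _len, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(reply, 0)
    if msg_type != NLMSG_ERROR:
        return True
    # The error payload starts with a signed 32-bit negative errno value.
    (error,) = struct.unpack_from("=i", reply, NLMSGHDR.size)
    return -error not in (errno.EOPNOTSUPP, errno.EINVAL)
```

Caching the result after the first probe would avoid repeating the round trip on every later command.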
    • Nicolas Dichtel
      5a2ce868
    • Nicolas Dichtel
      694ed195
  2. 10 Apr, 2015 8 commits
  3. 07 Apr, 2015 6 commits
  4. 24 Mar, 2015 17 commits
    • ip: support RFC4191 router preference · 194e9b85
      Lubomir Rintel authored
      This allows querying and setting the route preference. It's usually set from
      the IPv6 Neighbor Discovery Router Advertisement messages.
      
      Introduced in "ipv6: expose RFC4191 route preference via rtnetlink", enqueued
      for Linux 4.1.
      Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
      194e9b85
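For readers unfamiliar with where the preference comes from, here is a minimal sketch of decoding the 2-bit Prf field from the flags byte of a Router Advertisement as laid out in RFC 4191. The function name and the string labels are illustrative assumptions, not taken from the patch.

```python
# RFC 4191 encodes the preference as a 2-bit signed value.
PREF_NAMES = {0b01: "high", 0b00: "medium", 0b11: "low"}


def router_preference(ra_flags: int) -> str:
    """Decode Prf from the RA flags byte (layout: M, O, H, Prf, Prf, 3 reserved bits)."""
    prf = (ra_flags >> 3) & 0b11
    # The reserved value 10 is treated as medium by receivers.
    return PREF_NAMES.get(prf, "medium")
```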
    • add basic mpls support to iproute · dacc5d41
      Eric W. Biederman authored
      - Pull in the uapi mpls.h
      - Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
      - Define AF_MPLS in utils.h if it is not defined from elsewhere
        as is done with AF_DECnet
      
      The address syntax for multiple mpls labels is a complete invention.
      When I looked there seemed to be no widespread convention for talking
      about an mpls label stack in text form.  Sometimes people did:
      "{ Label1, Label2, Label3 }", sometimes people would do:
      "[ label3, label2, label1 ]", and most of the time label
      stacks were not explicitly shown at all.

      The syntax I wound up using, so that it would not have spaces and
      would be visually distinct from other kinds of addresses, is:

      label1/label2/label3, where label1 is the label at the top of the label
      stack and label3 is the label at the bottom of the label stack.
      
      When there is a single label this matches what seems to be convention
      with other tools.  Just print out the numeric value of the mpls label.
      
      The netlink protocol for labels uses the on-the-wire format for a
      label stack. The ttl and traffic class are expected to be 0.  Using
      the on-the-wire format is common and is what happens with other address
      types. BGP, when passing label stacks, also uses this technique, with the
      exception that the ttl byte is not included, making each label in a BGP
      label stack 3 bytes instead of 4.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      dacc5d41
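The text syntax and the wire format described above can be sketched as follows. This is an illustration under stated assumptions, not the iproute2 parser: it assumes the standard 32-bit MPLS label stack entry (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL) and assumes the bottom-of-stack bit is set on the last label. Both function names are hypothetical.

```python
import struct

LABEL_SHIFT = 12   # the 20-bit label sits in the top bits of each 32-bit entry
BOS_SHIFT = 8      # bottom-of-stack bit; traffic class and TTL are left at 0


def encode_label_stack(text: str) -> bytes:
    """Turn "label1/label2/label3" (top of stack first) into 4-byte wire entries."""
    labels = [int(part) for part in text.split("/")]
    out = b""
    for pos, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError(f"label out of range: {label}")
        entry = label << LABEL_SHIFT
        if pos == len(labels) - 1:
            entry |= 1 << BOS_SHIFT   # mark the last label as bottom of stack
        out += struct.pack("!I", entry)  # network byte order
    return out


def to_bgp_form(wire: bytes) -> bytes:
    """Drop the TTL byte of every 4-byte entry, leaving 3 bytes per label."""
    return b"".join(wire[i:i + 3] for i in range(0, len(wire), 4))
```

For example, `encode_label_stack("100/200")` yields two 4-byte entries, and `to_bgp_form` shortens the same stack to 6 bytes.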
    • add support for the RTA_NEWDST attribute. · 6f7a9f4d
      Eric W. Biederman authored
      This attribute is like RTA_DST except it specifies the destination
      address to place on a packet when it leaves the host.  For IP-based
      protocols this is destination NAT and not a common part of forwarding.
      For protocols like MPLS, label swapping is something that typically
      happens on every hop.

      There is likely to be an RTA_NEWSRC at some point, so RTA_NEWDST
      is printed as "as to" and can be specified either as "as to"
      or just "as".
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      6f7a9f4d
    • add support for the RTA_VIA attribute · 93ae2835
      Eric W. Biederman authored
      Add support for the RTA_VIA attribute that specifies an address family
      as well as an address for the next hop gateway.
      
      To make this easy to pass, reorder inet_prefix so that its tail
      is a proper RTA_VIA attribute.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      93ae2835
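To make the "address family plus address" idea concrete, here is a small sketch of such a payload. It assumes a 16-bit family in host byte order followed directly by the raw address bytes; `build_via` and `parse_via` are hypothetical helpers for illustration only.

```python
import socket
import struct


def build_via(family: int, address: str) -> bytes:
    """Pack a 16-bit address family followed by the raw next-hop address."""
    return struct.pack("=H", family) + socket.inet_pton(family, address)


def parse_via(payload: bytes):
    """Split a payload built by build_via back into (family, address string)."""
    (family,) = struct.unpack_from("=H", payload, 0)
    return family, socket.inet_ntop(family, payload[2:])
```

Because the family travels with the address, the next hop can belong to a different family than the route itself, for instance an IPv4 gateway for an MPLS route.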
    • misc whitespace cleanup · 8e8f8de4
      Eric W. Biederman authored
      8e8f8de4
    • add address family to/from string helper functions. · 45c90d19
      Eric W. Biederman authored
      Add the functions family_name and read_family to convert an address
      family to a string and to convert a string to an address family.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      45c90d19
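A pair of helpers like this is essentially a two-way lookup table. The sketch below reuses the names family_name and read_family from the commit message, but the table contents and the fallback values (`AF_UNSPEC` for an unknown name, `"???"` for an unknown family) are assumptions made for illustration.

```python
import socket

# Hypothetical name table; the real helpers may know more families.
_FAMILIES = {"inet": socket.AF_INET, "inet6": socket.AF_INET6}


def read_family(name: str) -> int:
    """String to address family; unknown names map to AF_UNSPEC (assumed)."""
    return _FAMILIES.get(name, socket.AF_UNSPEC)


def family_name(family: int) -> str:
    """Address family to string; unknown families map to a placeholder (assumed)."""
    for name, value in _FAMILIES.items():
        if value == family:
            return name
    return "???"
```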
    • Eric W. Biederman
    • make the addr argument of ll_addr_n2a const · 71b4d59b
      Eric W. Biederman authored
      This avoids build warnings when AF_PACKET support is added
      to rt_addr_n2a.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      71b4d59b
    • add a source address length parameter to rt_addr_n2a · 26dcdf3a
      Eric W. Biederman authored
      For some address families (like AF_PACKET) it is helpful to have the
      length when printing the address.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      26dcdf3a
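The reason a length helps can be shown with a short sketch: IPv4 and IPv6 addresses have a size fixed by the family, while link-layer addresses do not, so the caller has to say how many bytes to print. `addr_n2a` is a hypothetical stand-in, not the real rt_addr_n2a, and the fallback value 17 for AF_PACKET is the Linux value, assumed on platforms where Python does not expose the constant.

```python
import socket

# AF_PACKET is Linux-specific in Python; 17 is assumed elsewhere.
AF_PACKET = getattr(socket, "AF_PACKET", 17)


def addr_n2a(family: int, length: int, addr: bytes) -> str:
    """Format `length` bytes of `addr` according to the address family."""
    data = addr[:length]
    if family in (socket.AF_INET, socket.AF_INET6):
        return socket.inet_ntop(family, data)
    if family == AF_PACKET:
        # Link-layer addresses vary in size, so the supplied length
        # decides how many bytes are printed.
        return ":".join(f"{b:02x}" for b in data)
    return "???"
```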
    • tc: add eBPF support to f_bpf · 11c39b5e
      Daniel Borkmann authored
      This work adds the tc frontend for kernel commit e2e9b6541dd4 ("cls_bpf:
      add initial eBPF support for programmable classifiers").
      
      A C-like classifier program (e.g. see e2e9b6541dd4) is compiled via
      LLVM's eBPF backend into an ELF file, which is then passed to tc. tc
      loads the eBPF maps, if any, and the eBPF opcodes (with fixed-up eBPF map
      file descriptors) out of their dedicated sections and into the kernel via
      bpf(2), then passes the resulting fd via netlink down to cls_bpf. cls_bpf
      allows for annotations; currently I've used the file name for that, so that
      users can easily identify their filter when dumping configurations back.
      
      Example usage:
      
        clang -O2 -emit-llvm -c cls.c -o - | llc -march=bpf -filetype=obj -o cls.o
        tc filter add dev em1 parent 1: bpf run object-file cls.o classid x:y
      
        tc filter show dev em1 [...]
        filter parent 1: protocol all pref 49152 bpf handle 0x1 flowid x:y cls.o
      
      I placed the parser bits, derived from Alexei's kernel sample, into tc_bpf.c
      because my next step is to add the same support for the BPF action, so we
      can have a fully fledged eBPF classifier and action in tc.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      11c39b5e
    • update kernel headers to net-next 4.0-rc5 · cbdc3ed8
      Stephen Hemminger authored
      Latest features
      cbdc3ed8
    • misc: header rebase, add bpf.h · b54ac87e
      Daniel Borkmann authored
      Include the bpf.h uapi header file.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      b54ac87e
    • ip: enable configuring multicast group autojoin · e31867ac
      Madhu Challa authored
      Joining a multicast group at the Ethernet level via the "ip maddr" command
      does not work if an Ethernet switch does IGMP snooping, since the switch
      will not replicate multicast packets on ports that have not sent IGMP
      reports for the multicast addresses.

      Linux vxlan interfaces created via "ip link add vxlan" have the group option
      that enables them to do the required join.
      
      By extending ip address command with option "autojoin" we can get similar
      functionality for openvswitch vxlan interfaces as well as other tunneling
      mechanisms that need to receive multicast traffic.
      
      example:
      ip address add 224.1.1.10/24 dev eth5 autojoin
      ip address del 224.1.1.10/24 dev eth5
      e31867ac
    • route: label externally offloaded routes · 655444bd
      Scott Feldman authored
      In the ip route dump output, label externally offloaded routes with "external".
      Offloaded routes are flagged with RTNH_F_EXTERNAL, a recent addition to
      net-next.  For example:
      
      $ ip route
      default via 192.168.0.2 dev eth0
      11.0.0.0/30 dev swp1  proto kernel  scope link  src 11.0.0.2 external
      11.0.0.4/30 via 11.0.0.1 dev swp1  proto zebra  metric 20 external
      11.0.0.8/30 dev swp2  proto kernel  scope link  src 11.0.0.10 external
      11.0.0.12/30 via 11.0.0.9 dev swp2  proto zebra  metric 20 external
      12.0.0.2  proto zebra  metric 30 external
              nexthop via 11.0.0.1  dev swp1 weight 1
              nexthop via 11.0.0.9  dev swp2 weight 1
      12.0.0.3 via 11.0.0.1 dev swp1  proto zebra  metric 20 external
      12.0.0.4 via 11.0.0.9 dev swp2  proto zebra  metric 20 external
      192.168.0.0/24 dev eth0  proto kernel  scope link  src 192.168.0.15
      Signed-off-by: Scott Feldman <sfeldma@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@resnulli.us>
      655444bd
    • update headers files for net-next · 61333d24
      Stephen Hemminger authored
      Use sanitized headers from 4.0.0-rc3
      61333d24
    • tc: m_bpf: fix next arg selection after tc opcode · 51cf3675
      Daniel Borkmann authored
      The next argument after the tc opcode/verdict is optional, but NEXT_ARG()
      requires another argument to follow, otherwise tc will bail out.
      Therefore, we need to advance to the next argument manually, as done
      elsewhere.
      
      Fixes: 86ab59a6 ("tc: add support for BPF based actions")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      51cf3675
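The difference between a strict advance and a manual one can be sketched in a few lines. This is not the tc parser: `next_arg`, `parse_verdict` and the optional `index N` keyword are hypothetical, chosen only to show why a strict helper is the wrong tool after an argument that may be the last one on the command line.

```python
class IncompleteCommand(Exception):
    """Raised when a keyword is not followed by the argument it requires."""


def next_arg(args, i):
    """Strict advance: fail unless another token follows position i."""
    if i + 1 >= len(args):
        raise IncompleteCommand(args[i])
    return i + 1


def parse_verdict(args, i=0):
    """Parse `<verdict> [index N]` and return (verdict, index, next position)."""
    verdict = args[i]
    i += 1                      # manual advance: nothing has to follow the verdict
    index = None
    if i < len(args) and args[i] == "index":
        i = next_arg(args, i)   # "index" does need a value, so the strict form fits
        index = int(args[i])
        i += 1
    return verdict, index, i
```

Using `next_arg` right after the verdict would reject a command that ends there, which is the kind of failure the commit message describes.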
    • Vadim Kochan
      599fc319