- 27 Apr, 2015 4 commits
-
-
Daniel Borkmann authored
This work follows upon commit 6256f8c9 ("tc, bpf: finalize eBPF support for cls and act front-end") and takes up the idea proposed by Hannes Frederic Sowa to spawn a shell (or any other command) that holds generated eBPF map file descriptors.

File descriptors, based on their id, are fetched from the same unix domain socket as demonstrated in the bpf_agent, the shell is spawned via execvpe(2), and the map fds are passed over the environment. They are thus made available to applications in the fashion of std{in,out,err} for read/write access, for example in case of iproute2's examples/bpf/:

  # env | grep BPF
  BPF_NUM_MAPS=3
  BPF_MAP1=6 <- BPF_MAP_ID_QUEUE (id 1)
  BPF_MAP0=5 <- BPF_MAP_ID_PROTO (id 0)
  BPF_MAP2=7 <- BPF_MAP_ID_DROPS (id 2)

  # ls -la /proc/self/fd
  [...]
  lrwx------. 1 root root 64 Apr 14 16:46 0 -> /dev/pts/4
  lrwx------. 1 root root 64 Apr 14 16:46 1 -> /dev/pts/4
  lrwx------. 1 root root 64 Apr 14 16:46 2 -> /dev/pts/4
  [...]
  lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map
  lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map
  lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map

The advantage (as opposed to the direct/native usage) is that now the shell is the map fd owner and applications can terminate and easily reattach to descriptors w/o any kernel changes. Moreover, multiple applications can easily read/write eBPF maps simultaneously.

To further allow users to experiment with that, the next step is to add a small helper that can get along with simple data types, so that shell scripts can also make use of the bpf syscall, f.e. to read/write into maps.

Generally, this allows for prepopulating maps, or any runtime altering which could influence eBPF program behaviour (f.e. different run-time classifications, skb modifications, ...), dumping of statistics, etc.

Reference: http://thread.gmane.org/gmane.linux.network/357471/focus=357860
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
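As a rough illustration of how an application started from such a shell might pick up the inherited descriptors, the sketch below parses the BPF_NUM_MAPS and BPF_MAP<id> variables shown in the listing above. The function name bpf_map_fds is hypothetical, and the sketch assumes map ids run contiguously from 0 to BPF_NUM_MAPS - 1, as they do in the example output; it only recovers the fd numbers and does not perform any bpf(2) calls.

```python
import os


def bpf_map_fds(environ=None):
    """Return {map_id: fd} parsed from BPF_NUM_MAPS / BPF_MAP<id> variables.

    Hypothetical helper: assumes ids are contiguous, 0 .. BPF_NUM_MAPS - 1.
    """
    env = os.environ if environ is None else environ
    num_maps = int(env.get("BPF_NUM_MAPS", "0"))
    fds = {}
    for map_id in range(num_maps):
        value = env.get(f"BPF_MAP{map_id}")
        if value is None:
            # Assumption: a missing entry is skipped rather than treated as fatal.
            continue
        fds[map_id] = int(value)
    return fds


if __name__ == "__main__":
    # Values copied from the `env | grep BPF` listing in the commit message.
    example = {
        "BPF_NUM_MAPS": "3",
        "BPF_MAP0": "5",
        "BPF_MAP1": "6",
        "BPF_MAP2": "7",
    }
    print(bpf_map_fds(example))
```

Called without arguments, the helper reads the real process environment, so it would only return anything useful when run inside a shell spawned as described above.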
-
Nicolas Dichtel authored
This flag is only for the netlink protocol (multi-part messages); there is no reason to reject messages without it.

Note that this flag was removed by the following kernel patches (v3.14):
  65886f439ab0 ipmr: fix mfc notification flags
  f518338b1603 ip6mr: fix mfc notification flags

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
Nicolas Dichtel authored
The warning was:
  In file included from namespace.c:14:0:
  ../include/namespace.h: In function ‘setns’:
  ../include/namespace.h:37:2: warning: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
Nicolas Dichtel authored
The warning was:
  m_simple.c: In function ‘parse_simple’:
  m_simple.c:142:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘size_t’ [-Wformat]

Useful to be able to compile with -Werror.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
- 20 Apr, 2015 8 commits
-
-
Vadim Kochan authored
Use correct handle buffer length. Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
-
Nicolas Dichtel authored
XFRM netlink family is independent from the route netlink family. It's wrong to call rtnl_wilddump_request(), because it will add a 'struct ifinfomsg' into the header and the kernel will complain (at least for xfrm state):
  netlink: 24 bytes leftover after parsing attributes in process `ip'.

Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
Nicolas Dichtel authored
Two commands are added:
 - ip netns list-id
 - ip monitor nsid

A cache is also added to remember the association between the iproute2 netns name (from /var/run/netns/) and the nsid.

To avoid interfering with the rth socket, a new rtnl socket (rtnsh) is used to get nsid (we may send rtnl request during listing on rth).

Example:
  $ ip netns list-id
  nsid 0 (iproute2 netns name: foo)
  $ ip monitor nsid
  Deleted nsid 0 (iproute2 netns name: foo)
  nsid 16 (iproute2 netns name: bar)

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
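The name/nsid cache described in this commit can be pictured as a pair of lookup tables. The sketch below is only an illustration of that idea, not the iproute2 implementation: NsidCache and format_nsid are hypothetical names, and the output strings simply mirror the example lines quoted in the commit message.

```python
class NsidCache:
    """Hypothetical two-way cache between netns names and nsids."""

    def __init__(self):
        self._name_by_nsid = {}
        self._nsid_by_name = {}

    def add(self, name, nsid):
        self._name_by_nsid[nsid] = name
        self._nsid_by_name[name] = nsid

    def remove(self, nsid):
        name = self._name_by_nsid.pop(nsid, None)
        if name is not None:
            self._nsid_by_name.pop(name, None)

    def name_for(self, nsid):
        return self._name_by_nsid.get(nsid)

    def nsid_for(self, name):
        return self._nsid_by_name.get(name)


def format_nsid(cache, nsid, deleted=False):
    """Render one line in the style of the commit's example output."""
    line = f"nsid {nsid}"
    name = cache.name_for(nsid)
    if name is not None:
        # The name is only printed when the cache knows the association.
        line += f" (iproute2 netns name: {name})"
    return ("Deleted " if deleted else "") + line


if __name__ == "__main__":
    cache = NsidCache()
    cache.add("foo", 0)
    print(format_nsid(cache, 0))
    print(format_nsid(cache, 0, deleted=True))
```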
-
Pavel Šimerda authored
See also:
 * https://bugzilla.redhat.com/show_bug.cgi?id=736332

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
-
Pavel Šimerda authored
See also:
 * https://bugzilla.redhat.com/show_bug.cgi?id=977845

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
-
Pavel Šimerda authored
Without modification, using the example resulted in the following error:

  [root@localhost sbin]# cbq restart
  find: warning: you have specified the -maxdepth option after a non-option argument (, but options are not positional (-maxdepth affects tests specified before it as well as those specified after it). Please specify options before other arguments.
  find: warning: you have specified the -maxdepth option after a non-option argument (, but options are not positional (-maxdepth affects tests specified before it as well as those specified after it). Please specify options before other arguments.
  **CBQ: failed to compile CBQ configuration!

See also:
 * https://bugzilla.redhat.com/show_bug.cgi?id=539232

Reported-by: Mads Kiilerich <mads@kiilerich.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
-
Pavel Šimerda authored
When creating an IPsec SA that sets 'proto any' (IPPROTO_IP) and specifies 'sport' and 'dport' at the same time in selector, the following error is issued:
  "sport" and "dport" are invalid with proto=ip

However, using IPPROTO_IP with ports is completely legal and necessary when one wants to share the SA on both TCP and UDP. One of the applications requiring sharing SAs is 3GPP IMS AKA authentication.

See also:
 * https://bugzilla.redhat.com/show_bug.cgi?id=497355

Reported-by: Jiří Klimeš <jklimes@redhat.com>
Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
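The relaxed selector check can be sketched as follows. This is an illustrative model only: ports_allowed and PORT_CAPABLE are hypothetical names, and the set of port-carrying protocols other than IPPROTO_IP (TCP, UDP, DCCP, SCTP) is an assumption made for the example rather than something the commit message states. The protocol numbers are the standard IANA values.

```python
# Standard IANA protocol numbers.
IPPROTO_IP = 0      # what 'proto any' maps to
IPPROTO_ICMP = 1
IPPROTO_TCP = 6
IPPROTO_UDP = 17
IPPROTO_DCCP = 33
IPPROTO_SCTP = 132

# Assumption: protocols for which a selector may carry sport/dport.
# IPPROTO_IP is included so that one SA can be shared by TCP and UDP.
PORT_CAPABLE = {IPPROTO_IP, IPPROTO_TCP, IPPROTO_UDP, IPPROTO_DCCP, IPPROTO_SCTP}


def ports_allowed(proto, sport=None, dport=None):
    """Return True if the selector's port settings are valid for proto."""
    if sport is None and dport is None:
        # No ports given: nothing to validate.
        return True
    return proto in PORT_CAPABLE


if __name__ == "__main__":
    print(ports_allowed(IPPROTO_IP, sport=5060, dport=5060))
    print(ports_allowed(IPPROTO_ICMP, sport=80))
```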
-
Pavel Šimerda authored
Changes:
 * Accept directory settings from environment.
 * Remove redundant ROOTDIR variable.
 * Set KERNEL_INCLUDE default to '/usr/include'.
 * Use CFLAGS from environment.

Note: In the long term it might be better to improve the configure script to generate those parts of the Makefile in a manner similar to autoconf. It might even be practical to autotoolize the package.

Signed-off-by: Pavel Šimerda <psimerda@redhat.com>
-
- 13 Apr, 2015 9 commits
-
-
Felix Fietkau authored
Add ability to add the netfilter connmark support. Typical usage:

...let's tag outgoing icmp with mark 0x10...
  iptables -tmangle -A PREROUTING -p icmp -j CONNMARK --set-mark 0x10

...add on ingress of $ETH an extractor for connmark...
  tc filter add dev $ETH parent ffff: prio 4 protocol ip \
    u32 match ip protocol 1 0xff \
    flowid 1:1 \
    action connmark continue

...if the connmark was 0x11, we police to a ridic rate of 10Kbps
  tc filter add dev $ETH parent ffff: prio 5 protocol ip \
    handle 0x11 fw flowid 1:1 \
    action police rate 10kbit burst 10k

Other ways to use the connmark are to supply the zone, index and branching choice. Refer to help.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
-
Stephen Hemminger authored
Needed for later tc action patches
-
Andy Gospodarek authored
The kernel now has the capability to offload FDB and FIB entries to hardware. It is important to let users know if table entries are also offloaded to hardware. Currently offloaded FDB entries are indicated by the existence of the flag 'external' on the entry as of the following commit:

  commit 28467b7f
  Author: Scott Feldman <sfeldma@gmail.com>
  Date: Thu Dec 4 09:57:15 2014 +0100

    bridge/fdb: add flag/indication for FDB entry synced from offload device

When the patch to add support for indicating that FIB entries were also offloaded was posted to netdev by Scott Feldman, it became clear that 'external' would not be an ideal name for routes. There could definitely be confusion about what this might mean since many routes are to external networks -- a collision/confusion that did not happen with FDB.

Scott Feldman asked me to check with others and build consensus around a name. After speaking with several people about this I am proposing we refer to both FDB and FIB entries that are currently backed by hardware (based on the work done in rocker) with the flag 'offload' appended to the end of the entry.

Some people liked the string 'external,' others liked 'hardware,' but the point is to communicate that these routes are available to something that will offload the forwarding normally done by the kernel. Since the term 'offload' is used so frequently it seems appropriate to use the same language in ip/bridge output.

The term 'offload' also seems to resonate with many of the people who have responded on Scott's original thread or to those who I reached out to directly and did respond to my query, so it seems we have reached consensus that it should be the term used going forward.

v2: rebased against net-next branch

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: John W. Linville <linville@tuxdriver.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
-
Stephen Hemminger authored
-
Stephen Hemminger authored
-
Stephen Hemminger authored
-
Nicolas Dichtel authored
The goal of this patch is to test during the runtime if the command RTM_GETNSID is supported by the kernel. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
Nicolas Dichtel authored
This reverts commit d116ff34. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
Nicolas Dichtel authored
This reverts commit d059de70. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
-
- 10 Apr, 2015 8 commits
-
-
Daniel Borkmann authored
This work finalizes both eBPF front-ends for the classifier and action part in tc, it allows for custom ELF section selection, a simplified tc command frontend (while keeping compat), reusing of common maps between classifier and actions residing in the same object file, and exporting of all map fds to an eBPF agent for handing off further control in user space. It also adds an extensive example of how eBPF can be used, and a minimal self-contained example agent that dumps map data. The example is well documented and hopefully provides a good starting point into programming cls_bpf and act_bpf. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
-
Stephen Hemminger authored
-
Vadim Kochan authored
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
-
Jiri Benc authored
Fixes: d116ff34 ("ip netns: Fix rtnl error while print netns list") Signed-off-by: Jiri Benc <jbenc@redhat.com>
-
Christophe Gouault authored
- document ip xfrm policy set
- update ip xfrm monitor documentation
- in DESCRIPTION section, reorganize grouping of commands

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
-
Christophe Gouault authored
Add a new command to configure the SPD hash table:
  ip xfrm policy set [ hthresh4 LBITS RBITS ] [ hthresh6 LBITS RBITS ]

and code to display the SPD hash configuration:
  ip -s -s xfrm policy count

hthresh4: defines minimum local and remote IPv4 prefix lengths of selectors to hash a policy. If prefix lengths are greater than or equal to the thresholds, then the policy is hashed, otherwise it falls back to the policy_inexact chained list.

hthresh6: defines minimum local and remote IPv6 prefix lengths of selectors to hash a policy, otherwise it falls back to the policy_inexact chained list.

Example:

  % ip -s -s xfrm policy count
  SPD IN 0 OUT 0 FWD 0 (Sock: IN 0 OUT 0 FWD 0)
  SPD buckets: count 7 Max 1048576
  SPD IPv4 thresholds: local 32 remote 32
  SPD IPv6 thresholds: local 128 remote 128

  % ip xfrm pol set hthresh4 24 16 hthresh6 64 56

  % ip -s -s xfrm policy count
  SPD IN 0 OUT 0 FWD 0 (Sock: IN 0 OUT 0 FWD 0)
  SPD buckets: count 7 Max 1048576
  SPD IPv4 thresholds: local 24 remote 16
  SPD IPv6 thresholds: local 64 remote 56

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
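The threshold rule described for hthresh4 and hthresh6 boils down to a comparison on both prefix lengths. The sketch below models just that rule; is_policy_hashed is a hypothetical name and the function says nothing about how the kernel actually organizes its hash tables.

```python
def is_policy_hashed(local_prefixlen, remote_prefixlen, lbits, rbits):
    """Return True if a selector with these prefix lengths would be hashed.

    lbits / rbits are the local and remote thresholds (the LBITS and RBITS
    arguments of hthresh4 or hthresh6). A policy whose prefix lengths are
    both greater than or equal to the thresholds is hashed; otherwise it
    falls back to the policy_inexact list.
    """
    return local_prefixlen >= lbits and remote_prefixlen >= rbits


if __name__ == "__main__":
    # With the default IPv4 thresholds (32/32) only host-to-host selectors hash.
    print(is_policy_hashed(24, 16, lbits=32, rbits=32))
    # After `ip xfrm pol set hthresh4 24 16` the same selector is hashed.
    print(is_policy_hashed(24, 16, lbits=24, rbits=16))
```

Lowering the thresholds therefore trades a larger set of hashable policies against coarser hash keys.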
-
Stephen Hemminger authored
Current sanitized kernel headers from net-next
-
Stephen Hemminger authored
Need to include netinet/in.h to get the correct glibc headers instead of getting definitions in linux/in6.h
-
- 07 Apr, 2015 6 commits
-
-
Stephen Hemminger authored
Conflicts: man/man8/ip-route.8.in
-
Pavel Šimerda authored
Result of the following command: sed -ri 's/\. /. /g' man/*/* Signed-Off-By: Pavel Šimerda <psimerda@redhat.com>
-
Vadim Kochan authored
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
-
Vadim Kochan authored
Output of the usage was shifted because of a missing TAB. Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
-
Vadim Kochan authored
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
-
Vadim Kochan authored
If '-nm' is specified, do not fail if there is no default class names file in /etc/iproute2. Changed default class name file cls_names -> tc_cls. Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
-
- 24 Mar, 2015 5 commits
-
-
Lubomir Rintel authored
This allows querying and setting the route preference. It's usually set from the IPv6 Neighbor Discovery Router Advertisement messages. Introduced in "ipv6: expose RFC4191 route preference via rtnetlink", enqueued for Linux 4.1. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
-
Eric W. Biederman authored
- Pull in the uapi mpls.h
- Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
- Define AF_MPLS in utils.h if it is not defined from elsewhere as is done with AF_DECnet

The address syntax for multiple mpls labels is a complete invention. When I looked there seemed to be no widespread convention for talking about an mpls label stack in text form. Sometimes people did: "{ Label1, Label2, Label3 }", sometimes people would do: "[ label3, label2, label1 ]", and most of the time label stacks were not explicitly shown at all.

The syntax I wound up using, so it would not have spaces and so it would be visually distinct from other kinds of addresses, is:

  label1/label2/label3

Where label1 is the label at the top of the label stack and label3 is the label at the bottom of the label stack.

When there is a single label this matches what seems to be convention with other tools. Just print out the numeric value of the mpls label.

The netlink protocol for labels uses the on the wire format for a label stack. The ttl and traffic class are expected to be 0. Using the on the wire format is common and what happens with other address types. BGP when passing label stacks also uses this technique with the exception that the ttl byte is not included, making each label in a BGP label stack 3 bytes instead of 4.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
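To make the label1/label2/label3 syntax and the 4-byte wire format concrete, here is a small sketch that converts between the two. It assumes the standard MPLS label stack entry layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL, network byte order) and assumes the bottom-of-stack bit is set on the last entry; traffic class and TTL are left at 0 as the commit message describes. The function names are hypothetical.

```python
import struct


def encode_label_stack(text):
    """Encode 'label1/label2/...' (top of stack first) into wire format."""
    labels = [int(part) for part in text.split("/")]
    entries = []
    for index, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError(f"label out of range: {label}")
        # Assumption: only the last entry carries the bottom-of-stack bit.
        bottom = 1 if index == len(labels) - 1 else 0
        # label:20 | traffic class:3 (0) | bottom-of-stack:1 | ttl:8 (0)
        entries.append(struct.pack("!I", (label << 12) | (bottom << 8)))
    return b"".join(entries)


def decode_label_stack(data):
    """Decode wire-format entries back into 'label1/label2/...' text."""
    if len(data) % 4:
        raise ValueError("label stack must be a multiple of 4 bytes")
    words = struct.unpack(f"!{len(data) // 4}I", data)
    return "/".join(str(word >> 12) for word in words)


if __name__ == "__main__":
    wire = encode_label_stack("100/200/300")
    print(wire.hex())
    print(decode_label_stack(wire))
```

Dropping the final TTL byte of each entry would give the 3-byte form that the commit message attributes to BGP.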
-
Eric W. Biederman authored
This attribute is like RTA_DST except it specifies the destination address to place on a packet when it leaves the host. For ip based protocols this is destination NAT and not a common part of forwarding. For protocols like MPLS, label swapping is something that typically happens on every hop. There is likely to be a RTA_NEWSRC at some point, so RTA_NEWDST is printed as "as to" and can be specified either as "as to" or just "as". Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
Eric W. Biederman authored
Add support for the RTA_VIA attribute that specifies an address family as well as an address for the next hop gateway. To make it easy to pass this, reorder inet_prefix so that its tail is a proper RTA_VIA attribute. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
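The "address family plus address" shape of an RTA_VIA payload can be sketched as below. This assumes the payload is a 16-bit address family in host byte order followed directly by the raw address bytes, which is how the kernel's struct rtvia is understood here; the netlink attribute header (length and type) is deliberately left out, and encode_rtvia is a hypothetical name.

```python
import socket
import struct


def encode_rtvia(family, address):
    """Build an RTA_VIA-style payload: family (host order u16) + raw address.

    Assumption: the layout follows struct rtvia, with no padding between
    the family field and the address bytes.
    """
    raw = socket.inet_pton(family, address)
    return struct.pack("=H", int(family)) + raw


if __name__ == "__main__":
    payload = encode_rtvia(socket.AF_INET, "192.0.2.1")
    print(len(payload), payload.hex())
```

Because the family travels with the address, a route in one family (for example MPLS) can name a next hop in another (for example IPv4 or IPv6).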
-
Eric W. Biederman authored
-