Commit 640a171c authored by Alexei Starovoitov

Merge branch 'samples/bpf: xdpsock app enhancements'

Ong Boon says:

====================

First of all, sorry for taking so long to get back to this series, and
thanks to Jesper and Song Liu for all the valuable feedback on series-1
at [1].

Since then I have looked into what Jesper suggested in [2] and worked on
revising the patch series into several patches for ease of review:

v1->v2:
1/7: [No change]. Add VLAN tag (ID & Priority) to the generated Tx-Only
     frames.

2/7: [No change]. Add DMAC and SMAC setting to the generated Tx-Only
     frames. If parameters are not set, previous DMAC and SMAC are used.

3/7: [New]. Add support for selecting different CLOCK for clock_gettime()
     used in get_nsecs.

4/7: [New]. This is a total rework of series-1 patch 3/4 [3]. It uses
     clock_nanosleep() as suggested by Jesper. It also adds a statistic
     for Tx schedule variance under application stats (-a|--app-stats)
     and makes the cyclic Tx operation and --poll mode mutually
     exclusive. The ability to specify the Tx cycle time together with
     batch size and packet count remains the same.

5/7: [New]. Add support for setting the Tx process scheduling policy and
     priority. By default, the SCHED_OTHER policy is used. This also
     matches the scheduling policy setting in [2].

6/7: [Change]. This is an update of series-1 patch 4/4 [4]. Added a Tx
     clean process time-out in 1s granularity with a configurable retry
     count (-O|--retries).

7/7: [New]. Added a timestamp to Tx packets following the pktgen_hdr
     format, matching the implementation in [2]. However, the sequence ID
     field is left as it is rather than carrying the process schedule
     diff as in [2].
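
As an illustration of the cyclic wake-up described in 3/7 and 4/7, here is
a minimal sketch, not the code from the patch. It sleeps on absolute
deadlines with clock_nanosleep(TIMER_ABSTIME) on a selectable clock and
records how late each wake-up was. The names struct cycle_stats,
ts_to_ns() and run_cycles() are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical container for wake-up lateness statistics. */
struct cycle_stats {
	long long min_ns;
	long long max_ns;
	long long sum_ns;
	long cycles;
};

static long long ts_to_ns(const struct timespec *ts)
{
	return (long long)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/*
 * Wake up `cycles` times, `period_ns` apart, on absolute deadlines of
 * `clock`, and record how late each wake-up was.
 * Returns 0 on success or the error number from clock_nanosleep().
 */
static int run_cycles(clockid_t clock, long long period_ns, int cycles,
		      struct cycle_stats *st)
{
	struct timespec now, next;
	long long next_ns, late;
	int i, err;

	clock_gettime(clock, &now);
	next_ns = ts_to_ns(&now) + period_ns;

	st->min_ns = LLONG_MAX;
	st->max_ns = 0;
	st->sum_ns = 0;
	st->cycles = 0;

	for (i = 0; i < cycles; i++) {
		next.tv_sec = next_ns / NSEC_PER_SEC;
		next.tv_nsec = next_ns % NSEC_PER_SEC;

		/* Absolute deadline, so lateness does not accumulate. */
		err = clock_nanosleep(clock, TIMER_ABSTIME, &next, NULL);
		if (err)
			return err;

		clock_gettime(clock, &now);
		late = ts_to_ns(&now) - next_ns;
		if (late < st->min_ns)
			st->min_ns = late;
		if (late > st->max_ns)
			st->max_ns = late;
		st->sum_ns += late;
		st->cycles++;

		next_ns += period_ns;
	}

	return 0;
}
```

Calling run_cycles(CLOCK_MONOTONIC, 1000000, 3, &st) would model a 1000 us
cycle; sum_ns / cycles then corresponds to the "ave" column of the
application stats.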

To summarize which program options have been added in the v2 series,
here is an example:

 DMAC (-G)                 = fa:8d:f1:e2:0b:e8
 SMAC (-H)                 = ce:17:07:17:3e:3a

 VLAN tagged (-V)
 VLAN ID (-J)              = 12
 VLAN Pri (-K)             = 3

 Tx Queue (-q)             = 3
 Cycle Time in us (-T)     = 1000
 Batch (-b)                = 2
 Packet Count (-C)         = 6
 Tx schedule policy (-W)   = FIFO
 Tx schedule priority (-U) = 50
 Clock selection (-w)      = REALTIME

 Tx timeout retries (-O)   = 5
 Tx timestamp (-y)
 Cyclic Tx schedule stat (-a)

Note: xdpsock sets UDP dest-port and src-port to 0x1000 as default.
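
For reference, the VLAN ID and priority options above end up in the
16-bit TCI field of the 802.1Q tag. The sketch below shows that packing
under the usual layout (priority in the top 3 bits, VLAN ID in the low 12
bits, DEI bit left at 0). The mask and shift values are the ones defined
in the diff; vlan_tci() and put_be16() are helper names made up for this
example.

```c
#include <stdint.h>

#define VLAN_PRIO_MASK	0xe000 /* Priority Code Point */
#define VLAN_PRIO_SHIFT	13
#define VLAN_VID_MASK	0x0fff /* VLAN Identifier */

/* Pack a VLAN ID and priority into a host-order TCI value. */
static uint16_t vlan_tci(uint16_t vid, uint16_t pri)
{
	uint16_t tci = vid & VLAN_VID_MASK;

	tci |= (uint16_t)((pri << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK);
	return tci;
}

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_be16(uint8_t *dst, uint16_t v)
{
	dst[0] = (uint8_t)(v >> 8);
	dst[1] = (uint8_t)(v & 0xff);
}
```

With the example values (-J 12 -K 3), vlan_tci(12, 3) gives 0x600c, which
tcpdump reports as "vlan 12, p 3".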

 Sending Board
 =============
 $ xdpsock -i eth0 -t -N -z -H ce:17:07:17:3e:3a -G fa:8d:f1:e2:0b:e8 \
   -V -J 12 -K 3 -q 3 \
   -T 1000 -b 2 -C 6 -W FIFO -U 50 -w REALTIME \
   -O 5 -y -a

  sock0@eth0:3 txonly xdp-drv
                    pps            pkts           0.00
 rx                 0              0
 tx                 0              6

                    calls/s        count
 rx empty polls     0              0
 fill fail polls    0              0
 copy tx sendtos    0              0
 tx wakeup sendtos  0              5
 opt polls          0              0

                    period     min        ave        max        cycle
 Cyclic TX          1000000    31033      32009      33397      3

 Receiving Board
 ===============
 $ tcpdump -nei eth0 udp port 0x1000 -vv -Q in -X \
    --time-stamp-precision nano
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
03:46:40.520111580 ce:17:07:17:3e:3a > fa:8d:f1:e2:0b:e8, ethertype 802.1Q (0x8100), length 62: vlan 12, p 3, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 44)
    10.10.10.16.4096 > 10.10.10.32.4096: [udp sum ok] UDP, length 16
        0x0000:  4500 002c 0000 0000 4011 527e 0a0a 0a10  E..,....@.R~....
        0x0010:  0a0a 0a20 1000 1000 0018 e997 be9b e955  ...............U
        0x0020:  0000 0000 61cd 2ba1 0006 987c            ....a.+....|
03:46:40.520112163 ce:17:07:17:3e:3a > fa:8d:f1:e2:0b:e8, ethertype 802.1Q (0x8100), length 62: vlan 12, p 3, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 44)
    10.10.10.16.4096 > 10.10.10.32.4096: [udp sum ok] UDP, length 16
        0x0000:  4500 002c 0000 0000 4011 527e 0a0a 0a10  E..,....@.R~....
        0x0010:  0a0a 0a20 1000 1000 0018 e996 be9b e955  ...............U
        0x0020:  0000 0001 61cd 2ba1 0006 987c            ....a.+....|
03:46:40.521066860 ce:17:07:17:3e:3a > fa:8d:f1:e2:0b:e8, ethertype 802.1Q (0x8100), length 62: vlan 12, p 3, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 44)
    10.10.10.16.4096 > 10.10.10.32.4096: [udp sum ok] UDP, length 16
        0x0000:  4500 002c 0000 0000 4011 527e 0a0a 0a10  E..,....@.R~....
        0x0010:  0a0a 0a20 1000 1000 0018 e5af be9b e955  ...............U
        0x0020:  0000 0002 61cd 2ba1 0006 9c62            ....a.+....b
03:46:40.521067012 ce:17:07:17:3e:3a > fa:8d:f1:e2:0b:e8, ethertype 802.1Q (0x8100), length 62: vlan 12, p 3, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 44)
    10.10.10.16.4096 > 10.10.10.32.4096: [udp sum ok] UDP, length 16
        0x0000:  4500 002c 0000 0000 4011 527e 0a0a 0a10  E..,....@.R~....
        0x0010:  0a0a 0a20 1000 1000 0018 e5ae be9b e955  ...............U
        0x0020:  0000 0003 61cd 2ba1 0006 9c62            ....a.+....b
03:46:40.522061935 ce:17:07:17:3e:3a > fa:8d:f1:e2:0b:e8, ethertype 802.1Q (0x8100), length 62: vlan 12, p 3, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 44)
    10.10.10.16.4096 > 10.10.10.32.4096: [udp sum ok] UDP, length 16
        0x0000:  4500 002c 0000 0000 4011 527e 0a0a 0a10  E..,....@.R~....
        0x0010:  0a0a 0a20 1000 1000 0018 e1c5 be9b e955  ...............U
        0x0020:  0000 0004 61cd 2ba1 0006 a04a            ....a.+....J
03:46:40.522062173 ce:17:07:17:3e:3a > fa:8d:f1:e2:0b:e8, ethertype 802.1Q (0x8100), length 62: vlan 12, p 3, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 44)
    10.10.10.16.4096 > 10.10.10.32.4096: [udp sum ok] UDP, length 16
        0x0000:  4500 002c 0000 0000 4011 527e 0a0a 0a10  E..,....@.R~....
        0x0010:  0a0a 0a20 1000 1000 0018 e1c4 be9b e955  ...............U
        0x0020:  0000 0005 61cd 2ba1 0006 a04a            ....a.+....J
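
The 16 bytes of UDP payload in each frame above are the pktgen_hdr added
by -y. As a minimal sketch (not part of the patch), the receiver side
could decode them as follows. struct pktgen_fields, get_be32() and
parse_pktgen() are names made up for this example; the byte array is
copied from the first frame in the dump.

```c
#include <stddef.h>
#include <stdint.h>

#define PKTGEN_MAGIC 0xbe9be955u

/* Host-order copy of the four big-endian pktgen_hdr fields. */
struct pktgen_fields {
	uint32_t magic;
	uint32_t seq_num;
	uint32_t tv_sec;
	uint32_t tv_usec;
};

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the payload is too short or has no magic. */
static int parse_pktgen(const uint8_t *payload, size_t len,
			struct pktgen_fields *out)
{
	if (len < 16)
		return -1;

	out->magic = get_be32(payload);
	if (out->magic != PKTGEN_MAGIC)
		return -1;

	out->seq_num = get_be32(payload + 4);
	out->tv_sec = get_be32(payload + 8);
	out->tv_usec = get_be32(payload + 12);
	return 0;
}

/* UDP payload of the first captured frame, taken from the hex dump. */
static const uint8_t first_payload[16] = {
	0xbe, 0x9b, 0xe9, 0x55, 0x00, 0x00, 0x00, 0x00,
	0x61, 0xcd, 0x2b, 0xa1, 0x00, 0x06, 0x98, 0x7c,
};
```

For the first frame this yields seq_num 0, tv_sec 1640836001 and tv_usec
432252. The third frame carries tv_usec 0x00069c62 = 433250, so the first
two batches were stamped 998 us apart.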

I have tested the above with both tagged and untagged packet formats
and, based on the timestamps in tcpdump, found that the timing of the
batch cyclic transmission is correct.
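
One way to run that check is to take the tcpdump timestamp of the first
frame of each batch and compare the gaps with the configured cycle time.
The sketch below does this arithmetic; delta_ns() and cycle_error_ns()
are helper names made up for this example, and receive-side timestamps
may also include NIC and capture latency.

```c
#include <stdint.h>

/* Nanoseconds elapsed from (s0, ns0) to (s1, ns1). */
static int64_t delta_ns(int64_t s0, int64_t ns0, int64_t s1, int64_t ns1)
{
	return (s1 - s0) * 1000000000LL + (ns1 - ns0);
}

/* Deviation of a measured gap from the configured cycle time, in ns. */
static int64_t cycle_error_ns(int64_t gap_ns, int64_t cycle_us)
{
	return gap_ns - cycle_us * 1000;
}
```

For the capture above, the batches arrive at 03:46:40.520111580,
.521066860 and .522061935, so the gaps are 955280 ns and 995075 ns against
a 1000 us cycle (-T 1000).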

I would appreciate it if the community could give the v2 patch series a
try and point out any gaps.

Thanks
Boon Leong

[1] https://patchwork.kernel.org/project/netdevbpf/cover/20211124091821.3916046-1-boon.leong.ong@intel.com/
[2] https://github.com/netoptimizer/network-testing/blob/master/src/udp_pacer.c
[3] https://patchwork.kernel.org/project/netdevbpf/patch/20211124091821.3916046-4-boon.leong.ong@intel.com/
[4] https://patchwork.kernel.org/project/netdevbpf/patch/20211124091821.3916046-5-boon.leong.ong@intel.com/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
parents 5f608264 eb68db45
@@ -14,6 +14,7 @@
 #include <arpa/inet.h>
 #include <locale.h>
 #include <net/ethernet.h>
+#include <netinet/ether.h>
 #include <net/if.h>
 #include <poll.h>
 #include <pthread.h>
@@ -30,6 +31,7 @@
 #include <sys/un.h>
 #include <time.h>
 #include <unistd.h>
+#include <sched.h>
 
 #include <bpf/libbpf.h>
 #include <bpf/xsk.h>
@@ -56,12 +58,27 @@
 
 #define DEBUG_HEXDUMP 0
 
+#define VLAN_PRIO_MASK 0xe000 /* Priority Code Point */
+#define VLAN_PRIO_SHIFT 13
+#define VLAN_VID_MASK 0x0fff /* VLAN Identifier */
+#define VLAN_VID__DEFAULT 1
+#define VLAN_PRI__DEFAULT 0
+
+#define NSEC_PER_SEC 1000000000UL
+#define NSEC_PER_USEC 1000
+
+#define SCHED_PRI__DEFAULT 0
+
 typedef __u64 u64;
 typedef __u32 u32;
 typedef __u16 u16;
 typedef __u8 u8;
 
 static unsigned long prev_time;
+static long tx_cycle_diff_min;
+static long tx_cycle_diff_max;
+static double tx_cycle_diff_ave;
+static long tx_cycle_cnt;
 
 enum benchmark_type {
 	BENCH_RXDROP = 0,
@@ -81,14 +98,23 @@ static u32 opt_batch_size = 64;
 static int opt_pkt_count;
 static u16 opt_pkt_size = MIN_PKT_SIZE;
 static u32 opt_pkt_fill_pattern = 0x12345678;
+static bool opt_vlan_tag;
+static u16 opt_pkt_vlan_id = VLAN_VID__DEFAULT;
+static u16 opt_pkt_vlan_pri = VLAN_PRI__DEFAULT;
+static struct ether_addr opt_txdmac = {{ 0x3c, 0xfd, 0xfe,
+					 0x9e, 0x7f, 0x71 }};
+static struct ether_addr opt_txsmac = {{ 0xec, 0xb1, 0xd7,
+					 0x98, 0x3a, 0xc0 }};
 static bool opt_extra_stats;
 static bool opt_quiet;
 static bool opt_app_stats;
 static const char *opt_irq_str = "";
 static u32 irq_no;
 static int irqs_at_init = -1;
+static u32 sequence;
 static int opt_poll;
 static int opt_interval = 1;
+static int opt_retries = 3;
 static u32 opt_xdp_bind_flags = XDP_USE_NEED_WAKEUP;
 static u32 opt_umem_flags;
 static int opt_unaligned_chunks;
@@ -100,6 +126,27 @@ static u32 opt_num_xsks = 1;
 static u32 prog_id;
 static bool opt_busy_poll;
 static bool opt_reduced_cap;
+static clockid_t opt_clock = CLOCK_MONOTONIC;
+static unsigned long opt_tx_cycle_ns;
+static int opt_schpolicy = SCHED_OTHER;
+static int opt_schprio = SCHED_PRI__DEFAULT;
+static bool opt_tstamp;
+
+struct vlan_ethhdr {
+	unsigned char h_dest[6];
+	unsigned char h_source[6];
+	__be16 h_vlan_proto;
+	__be16 h_vlan_TCI;
+	__be16 h_vlan_encapsulated_proto;
+};
+
+#define PKTGEN_MAGIC 0xbe9be955
+struct pktgen_hdr {
+	__be32 pgh_magic;
+	__be32 seq_num;
+	__be32 tv_sec;
+	__be32 tv_usec;
+};
 
 struct xsk_ring_stats {
 	unsigned long rx_npkts;
@@ -156,15 +203,63 @@ struct xsk_socket_info {
 	u32 outstanding_tx;
 };
 
+static const struct clockid_map {
+	const char *name;
+	clockid_t clockid;
+} clockids_map[] = {
+	{ "REALTIME", CLOCK_REALTIME },
+	{ "TAI", CLOCK_TAI },
+	{ "BOOTTIME", CLOCK_BOOTTIME },
+	{ "MONOTONIC", CLOCK_MONOTONIC },
+	{ NULL }
+};
+
+static const struct sched_map {
+	const char *name;
+	int policy;
+} schmap[] = {
+	{ "OTHER", SCHED_OTHER },
+	{ "FIFO", SCHED_FIFO },
+	{ NULL }
+};
+
 static int num_socks;
 struct xsk_socket_info *xsks[MAX_SOCKS];
 int sock;
 
+static int get_clockid(clockid_t *id, const char *name)
+{
+	const struct clockid_map *clk;
+
+	for (clk = clockids_map; clk->name; clk++) {
+		if (strcasecmp(clk->name, name) == 0) {
+			*id = clk->clockid;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
+static int get_schpolicy(int *policy, const char *name)
+{
+	const struct sched_map *sch;
+
+	for (sch = schmap; sch->name; sch++) {
+		if (strcasecmp(sch->name, name) == 0) {
+			*policy = sch->policy;
+			return 0;
+		}
+	}
+
+	return -1;
+}
+
 static unsigned long get_nsecs(void)
 {
 	struct timespec ts;
 
-	clock_gettime(CLOCK_MONOTONIC, &ts);
+	clock_gettime(opt_clock, &ts);
 	return ts.tv_sec * 1000000000UL + ts.tv_nsec;
 }
 
@@ -257,6 +352,15 @@ static void dump_app_stats(long dt)
 		xsks[i]->app_stats.prev_tx_wakeup_sendtos = xsks[i]->app_stats.tx_wakeup_sendtos;
 		xsks[i]->app_stats.prev_opt_polls = xsks[i]->app_stats.opt_polls;
 	}
+
+	if (opt_tx_cycle_ns) {
+		printf("\n%-18s %-10s %-10s %-10s %-10s %-10s\n",
+		       "", "period", "min", "ave", "max", "cycle");
+		printf("%-18s %-10lu %-10lu %-10lu %-10lu %-10lu\n",
+		       "Cyclic TX", opt_tx_cycle_ns, tx_cycle_diff_min,
+		       (long)(tx_cycle_diff_ave / tx_cycle_cnt),
+		       tx_cycle_diff_max, tx_cycle_cnt);
+	}
 }
 
 static bool get_interrupt_number(void)
@@ -740,29 +844,69 @@ static inline u16 udp_csum(u32 saddr, u32 daddr, u32 len,
 
 #define ETH_FCS_SIZE 4
 
-#define PKT_HDR_SIZE (sizeof(struct ethhdr) + sizeof(struct iphdr) + \
-		      sizeof(struct udphdr))
+#define ETH_HDR_SIZE (opt_vlan_tag ? sizeof(struct vlan_ethhdr) : \
+		      sizeof(struct ethhdr))
+#define PKTGEN_HDR_SIZE (opt_tstamp ? sizeof(struct pktgen_hdr) : 0)
+#define PKT_HDR_SIZE (ETH_HDR_SIZE + sizeof(struct iphdr) + \
+		      sizeof(struct udphdr) + PKTGEN_HDR_SIZE)
+#define PKTGEN_HDR_OFFSET (ETH_HDR_SIZE + sizeof(struct iphdr) + \
+			   sizeof(struct udphdr))
+#define PKTGEN_SIZE_MIN (PKTGEN_HDR_OFFSET + sizeof(struct pktgen_hdr) + \
+			 ETH_FCS_SIZE)
 
 #define PKT_SIZE (opt_pkt_size - ETH_FCS_SIZE)
-#define IP_PKT_SIZE (PKT_SIZE - sizeof(struct ethhdr))
+#define IP_PKT_SIZE (PKT_SIZE - ETH_HDR_SIZE)
 #define UDP_PKT_SIZE (IP_PKT_SIZE - sizeof(struct iphdr))
-#define UDP_PKT_DATA_SIZE (UDP_PKT_SIZE - sizeof(struct udphdr))
+#define UDP_PKT_DATA_SIZE (UDP_PKT_SIZE - \
+			   (sizeof(struct udphdr) + PKTGEN_HDR_SIZE))
 
 static u8 pkt_data[XSK_UMEM__DEFAULT_FRAME_SIZE];
 
 static void gen_eth_hdr_data(void)
 {
-	struct udphdr *udp_hdr = (struct udphdr *)(pkt_data +
-						   sizeof(struct ethhdr) +
-						   sizeof(struct iphdr));
-	struct iphdr *ip_hdr = (struct iphdr *)(pkt_data +
-						sizeof(struct ethhdr));
-	struct ethhdr *eth_hdr = (struct ethhdr *)pkt_data;
-
-	/* ethernet header */
-	memcpy(eth_hdr->h_dest, "\x3c\xfd\xfe\x9e\x7f\x71", ETH_ALEN);
-	memcpy(eth_hdr->h_source, "\xec\xb1\xd7\x98\x3a\xc0", ETH_ALEN);
-	eth_hdr->h_proto = htons(ETH_P_IP);
+	struct pktgen_hdr *pktgen_hdr;
+	struct udphdr *udp_hdr;
+	struct iphdr *ip_hdr;
+
+	if (opt_vlan_tag) {
+		struct vlan_ethhdr *veth_hdr = (struct vlan_ethhdr *)pkt_data;
+		u16 vlan_tci = 0;
+
+		udp_hdr = (struct udphdr *)(pkt_data +
+					    sizeof(struct vlan_ethhdr) +
+					    sizeof(struct iphdr));
+		ip_hdr = (struct iphdr *)(pkt_data +
+					  sizeof(struct vlan_ethhdr));
+		pktgen_hdr = (struct pktgen_hdr *)(pkt_data +
+						   sizeof(struct vlan_ethhdr) +
+						   sizeof(struct iphdr) +
+						   sizeof(struct udphdr));
+		/* ethernet & VLAN header */
+		memcpy(veth_hdr->h_dest, &opt_txdmac, ETH_ALEN);
+		memcpy(veth_hdr->h_source, &opt_txsmac, ETH_ALEN);
+		veth_hdr->h_vlan_proto = htons(ETH_P_8021Q);
+		vlan_tci = opt_pkt_vlan_id & VLAN_VID_MASK;
+		vlan_tci |= (opt_pkt_vlan_pri << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK;
+		veth_hdr->h_vlan_TCI = htons(vlan_tci);
+		veth_hdr->h_vlan_encapsulated_proto = htons(ETH_P_IP);
+	} else {
+		struct ethhdr *eth_hdr = (struct ethhdr *)pkt_data;
+
+		udp_hdr = (struct udphdr *)(pkt_data +
+					    sizeof(struct ethhdr) +
+					    sizeof(struct iphdr));
+		ip_hdr = (struct iphdr *)(pkt_data +
+					  sizeof(struct ethhdr));
+		pktgen_hdr = (struct pktgen_hdr *)(pkt_data +
+						   sizeof(struct ethhdr) +
+						   sizeof(struct iphdr) +
+						   sizeof(struct udphdr));
+		/* ethernet header */
+		memcpy(eth_hdr->h_dest, &opt_txdmac, ETH_ALEN);
+		memcpy(eth_hdr->h_source, &opt_txsmac, ETH_ALEN);
+		eth_hdr->h_proto = htons(ETH_P_IP);
+	}
+
 
 	/* IP header */
 	ip_hdr->version = IPVERSION;
@@ -785,6 +929,9 @@ static void gen_eth_hdr_data(void)
 	udp_hdr->dest = htons(0x1000);
 	udp_hdr->len = htons(UDP_PKT_SIZE);
 
+	if (opt_tstamp)
+		pktgen_hdr->pgh_magic = htonl(PKTGEN_MAGIC);
+
 	/* UDP data */
 	memset32_htonl(pkt_data + PKT_HDR_SIZE, opt_pkt_fill_pattern,
 		       UDP_PKT_DATA_SIZE);
@@ -908,6 +1055,7 @@ static struct option long_options[] = {
 	{"xdp-skb", no_argument, 0, 'S'},
 	{"xdp-native", no_argument, 0, 'N'},
 	{"interval", required_argument, 0, 'n'},
+	{"retries", required_argument, 0, 'O'},
 	{"zero-copy", no_argument, 0, 'z'},
 	{"copy", no_argument, 0, 'c'},
 	{"frame-size", required_argument, 0, 'f'},
@@ -916,10 +1064,20 @@ static struct option long_options[] = {
 	{"shared-umem", no_argument, 0, 'M'},
 	{"force", no_argument, 0, 'F'},
 	{"duration", required_argument, 0, 'd'},
+	{"clock", required_argument, 0, 'w'},
 	{"batch-size", required_argument, 0, 'b'},
 	{"tx-pkt-count", required_argument, 0, 'C'},
 	{"tx-pkt-size", required_argument, 0, 's'},
 	{"tx-pkt-pattern", required_argument, 0, 'P'},
+	{"tx-vlan", no_argument, 0, 'V'},
+	{"tx-vlan-id", required_argument, 0, 'J'},
+	{"tx-vlan-pri", required_argument, 0, 'K'},
+	{"tx-dmac", required_argument, 0, 'G'},
+	{"tx-smac", required_argument, 0, 'H'},
+	{"tx-cycle", required_argument, 0, 'T'},
+	{"tstamp", no_argument, 0, 'y'},
+	{"policy", required_argument, 0, 'W'},
+	{"schpri", required_argument, 0, 'U'},
 	{"extra-stats", no_argument, 0, 'x'},
 	{"quiet", no_argument, 0, 'Q'},
 	{"app-stats", no_argument, 0, 'a'},
@@ -943,6 +1101,7 @@ static void usage(const char *prog)
 		" -S, --xdp-skb=n Use XDP skb-mod\n"
 		" -N, --xdp-native=n Enforce XDP native mode\n"
 		" -n, --interval=n Specify statistics update interval (default 1 sec).\n"
+		" -O, --retries=n Specify time-out retries (1s interval) attempt (default 3).\n"
 		" -z, --zero-copy Force zero-copy mode.\n"
 		" -c, --copy Force copy mode.\n"
 		" -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
@@ -952,6 +1111,7 @@ static void usage(const char *prog)
 		" -F, --force Force loading the XDP prog\n"
 		" -d, --duration=n Duration in secs to run command.\n"
 		" Default: forever.\n"
+		" -w, --clock=CLOCK Clock NAME (default MONOTONIC).\n"
 		" -b, --batch-size=n Batch size for sending or receiving\n"
 		" packets. Default: %d\n"
 		" -C, --tx-pkt-count=n Number of packets to send.\n"
@@ -960,6 +1120,15 @@ static void usage(const char *prog)
 		" (Default: %d bytes)\n"
 		" Min size: %d, Max size %d.\n"
 		" -P, --tx-pkt-pattern=nPacket fill pattern. Default: 0x%x\n"
+		" -V, --tx-vlan Send VLAN tagged packets (For -t|--txonly)\n"
+		" -J, --tx-vlan-id=n Tx VLAN ID [1-4095]. Default: %d (For -V|--tx-vlan)\n"
+		" -K, --tx-vlan-pri=n Tx VLAN Priority [0-7]. Default: %d (For -V|--tx-vlan)\n"
+		" -G, --tx-dmac=<MAC> Dest MAC addr of TX frame in aa:bb:cc:dd:ee:ff format (For -V|--tx-vlan)\n"
+		" -H, --tx-smac=<MAC> Src MAC addr of TX frame in aa:bb:cc:dd:ee:ff format (For -V|--tx-vlan)\n"
+		" -T, --tx-cycle=n Tx cycle time in micro-seconds (For -t|--txonly).\n"
+		" -y, --tstamp Add time-stamp to packet (For -t|--txonly).\n"
+		" -W, --policy=POLICY Schedule policy. Default: SCHED_OTHER\n"
+		" -U, --schpri=n Schedule priority. Default: %d\n"
 		" -x, --extra-stats Display extra statistics.\n"
 		" -Q, --quiet Do not display any stats.\n"
 		" -a, --app-stats Display application (syscall) statistics.\n"
@@ -969,7 +1138,9 @@ static void usage(const char *prog)
 		"\n";
 	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE,
 		opt_batch_size, MIN_PKT_SIZE, MIN_PKT_SIZE,
-		XSK_UMEM__DEFAULT_FRAME_SIZE, opt_pkt_fill_pattern);
+		XSK_UMEM__DEFAULT_FRAME_SIZE, opt_pkt_fill_pattern,
+		VLAN_VID__DEFAULT, VLAN_PRI__DEFAULT,
+		SCHED_PRI__DEFAULT);
 
 	exit(EXIT_FAILURE);
 }
@@ -981,7 +1152,8 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:BR",
+		c = getopt_long(argc, argv,
+				"Frtli:q:pSNn:w:O:czf:muMd:b:C:s:P:VJ:K:G:H:T:yW:U:xQaI:BR",
 				long_options, &option_index);
 		if (c == -1)
 			break;
@@ -1015,6 +1187,17 @@ static void parse_command_line(int argc, char **argv)
 		case 'n':
 			opt_interval = atoi(optarg);
 			break;
+		case 'w':
+			if (get_clockid(&opt_clock, optarg)) {
+				fprintf(stderr,
+					"ERROR: Invalid clock %s. Default to CLOCK_MONOTONIC.\n",
+					optarg);
+				opt_clock = CLOCK_MONOTONIC;
+			}
+			break;
+		case 'O':
+			opt_retries = atoi(optarg);
+			break;
 		case 'z':
 			opt_xdp_bind_flags |= XDP_ZEROCOPY;
 			break;
@@ -1062,6 +1245,49 @@ static void parse_command_line(int argc, char **argv)
 		case 'P':
 			opt_pkt_fill_pattern = strtol(optarg, NULL, 16);
 			break;
+		case 'V':
+			opt_vlan_tag = true;
+			break;
+		case 'J':
+			opt_pkt_vlan_id = atoi(optarg);
+			break;
+		case 'K':
+			opt_pkt_vlan_pri = atoi(optarg);
+			break;
+		case 'G':
+			if (!ether_aton_r(optarg,
+					  (struct ether_addr *)&opt_txdmac)) {
+				fprintf(stderr, "Invalid dmac address:%s\n",
+					optarg);
+				usage(basename(argv[0]));
+			}
+			break;
+		case 'H':
+			if (!ether_aton_r(optarg,
+					  (struct ether_addr *)&opt_txsmac)) {
+				fprintf(stderr, "Invalid smac address:%s\n",
+					optarg);
+				usage(basename(argv[0]));
+			}
+			break;
+		case 'T':
+			opt_tx_cycle_ns = atoi(optarg);
+			opt_tx_cycle_ns *= NSEC_PER_USEC;
+			break;
+		case 'y':
+			opt_tstamp = 1;
+			break;
+		case 'W':
+			if (get_schpolicy(&opt_schpolicy, optarg)) {
+				fprintf(stderr,
+					"ERROR: Invalid policy %s. Default to SCHED_OTHER.\n",
+					optarg);
+				opt_schpolicy = SCHED_OTHER;
+			}
+			break;
+		case 'U':
+			opt_schprio = atoi(optarg);
+			break;
 		case 'x':
 			opt_extra_stats = 1;
 			break;
@@ -1267,16 +1493,22 @@ static void rx_drop_all(void)
 	}
 }
 
-static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
+static int tx_only(struct xsk_socket_info *xsk, u32 *frame_nb,
+		   int batch_size, unsigned long tx_ns)
 {
-	u32 idx;
+	u32 idx, tv_sec, tv_usec;
 	unsigned int i;
 
 	while (xsk_ring_prod__reserve(&xsk->tx, batch_size, &idx) <
 				      batch_size) {
 		complete_tx_only(xsk, batch_size);
 		if (benchmark_done)
-			return;
+			return 0;
+	}
+
+	if (opt_tstamp) {
+		tv_sec = (u32)(tx_ns / NSEC_PER_SEC);
+		tv_usec = (u32)((tx_ns % NSEC_PER_SEC) / 1000);
 	}
 
 	for (i = 0; i < batch_size; i++) {
@@ -1284,6 +1516,21 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 								  idx + i);
 		tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size;
 		tx_desc->len = PKT_SIZE;
+
+		if (opt_tstamp) {
+			struct pktgen_hdr *pktgen_hdr;
+			u64 addr = tx_desc->addr;
+			char *pkt;
+
+			pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
+			pktgen_hdr = (struct pktgen_hdr *)(pkt + PKTGEN_HDR_OFFSET);
+
+			pktgen_hdr->seq_num = htonl(sequence++);
+			pktgen_hdr->tv_sec = htonl(tv_sec);
+			pktgen_hdr->tv_usec = htonl(tv_usec);
+
+			hex_dump(pkt, PKT_SIZE, addr);
+		}
 	}
 
 	xsk_ring_prod__submit(&xsk->tx, batch_size);
@@ -1292,6 +1539,8 @@ static void tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, int batch_size)
 	*frame_nb += batch_size;
 	*frame_nb %= NUM_FRAMES;
 	complete_tx_only(xsk, batch_size);
+
+	return batch_size;
 }
 
 static inline int get_batch_size(int pkt_cnt)
@@ -1318,23 +1567,48 @@ static void complete_tx_only_all(void)
 				pending = !!xsks[i]->outstanding_tx;
 			}
 		}
-	} while (pending);
+		sleep(1);
+	} while (pending && opt_retries-- > 0);
 }
 
 static void tx_only_all(void)
 {
 	struct pollfd fds[MAX_SOCKS] = {};
 	u32 frame_nb[MAX_SOCKS] = {};
+	unsigned long next_tx_ns = 0;
 	int pkt_cnt = 0;
 	int i, ret;
 
+	if (opt_poll && opt_tx_cycle_ns) {
+		fprintf(stderr,
+			"Error: --poll and --tx-cycles are both set\n");
+		return;
+	}
+
 	for (i = 0; i < num_socks; i++) {
 		fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[0].events = POLLOUT;
 	}
 
+	if (opt_tx_cycle_ns) {
+		/* Align Tx time to micro-second boundary */
+		next_tx_ns = (get_nsecs() / NSEC_PER_USEC + 1) *
+			     NSEC_PER_USEC;
+		next_tx_ns += opt_tx_cycle_ns;
+
+		/* Initialize periodic Tx scheduling variance */
+		tx_cycle_diff_min = 1000000000;
+		tx_cycle_diff_max = 0;
+		tx_cycle_diff_ave = 0.0;
+	}
+
 	while ((opt_pkt_count && pkt_cnt < opt_pkt_count) || !opt_pkt_count) {
 		int batch_size = get_batch_size(pkt_cnt);
+		unsigned long tx_ns = 0;
+		struct timespec next;
+		int tx_cnt = 0;
+		long diff;
+		int err;
 
 		if (opt_poll) {
 			for (i = 0; i < num_socks; i++)
continue; continue;
} }
if (opt_tx_cycle_ns) {
next.tv_sec = next_tx_ns / NSEC_PER_SEC;
next.tv_nsec = next_tx_ns % NSEC_PER_SEC;
err = clock_nanosleep(opt_clock, TIMER_ABSTIME, &next, NULL);
if (err) {
if (err != EINTR)
fprintf(stderr,
"clock_nanosleep failed. Err:%d errno:%d\n",
err, errno);
break;
}
/* Measure periodic Tx scheduling variance */
tx_ns = get_nsecs();
diff = tx_ns - next_tx_ns;
if (diff < tx_cycle_diff_min)
tx_cycle_diff_min = diff;
if (diff > tx_cycle_diff_max)
tx_cycle_diff_max = diff;
tx_cycle_diff_ave += (double)diff;
tx_cycle_cnt++;
} else if (opt_tstamp) {
tx_ns = get_nsecs();
}
for (i = 0; i < num_socks; i++) for (i = 0; i < num_socks; i++)
tx_only(xsks[i], &frame_nb[i], batch_size); tx_cnt += tx_only(xsks[i], &frame_nb[i], batch_size, tx_ns);
pkt_cnt += batch_size; pkt_cnt += tx_cnt;
if (benchmark_done) if (benchmark_done)
break; break;
if (opt_tx_cycle_ns)
next_tx_ns += opt_tx_cycle_ns;
} }
if (opt_pkt_count) if (opt_pkt_count)
@@ -1584,6 +1888,7 @@ int main(int argc, char **argv)
 	struct __user_cap_data_struct data[2] = { { 0 } };
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	bool rx = false, tx = false;
+	struct sched_param schparam;
 	struct xsk_umem_info *umem;
 	struct bpf_object *obj;
 	int xsks_map_fd = 0;
@@ -1646,6 +1951,9 @@ int main(int argc, char **argv)
 		apply_setsockopt(xsks[i]);
 
 	if (opt_bench == BENCH_TXONLY) {
+		if (opt_tstamp && opt_pkt_size < PKTGEN_SIZE_MIN)
+			opt_pkt_size = PKTGEN_SIZE_MIN;
+
 		gen_eth_hdr_data();
 
 		for (i = 0; i < NUM_FRAMES; i++)
@@ -1685,6 +1993,16 @@ int main(int argc, char **argv)
 	prev_time = get_nsecs();
 	start_time = prev_time;
 
+	/* Configure sched priority for better wake-up accuracy */
+	memset(&schparam, 0, sizeof(schparam));
+	schparam.sched_priority = opt_schprio;
+	ret = sched_setscheduler(0, opt_schpolicy, &schparam);
+	if (ret) {
+		fprintf(stderr, "Error(%d) in setting priority(%d): %s\n",
+			errno, opt_schprio, strerror(errno));
+		goto out;
+	}
+
 	if (opt_bench == BENCH_RXDROP)
 		rx_drop_all();
 	else if (opt_bench == BENCH_TXONLY)
@@ -1692,6 +2010,7 @@ int main(int argc, char **argv)
 	else
 		l2fwd_all();
 
+out:
 	benchmark_done = true;
 
 	if (!opt_quiet)
......