Commit 7ed2bc80 authored by Vladimir Oltean, committed by David S. Miller

net: enetc: add support for XDP_TX

For reflecting packets back into the interface they came from, we create
an array of TX software BDs derived from the RX software BDs. Therefore,
we need to extend the TX software BD structure to contain most of the
stuff that's already present in the RX software BD structure, for
reasons that will become evident in a moment.
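
To make the derivation concrete, here is a sketch (not the driver's
exact code; field names follow the structs in the diff below) of what
building a TX software BD from an RX software BD amounts to:

/* Sketch: copy the buffer bookkeeping from the RX software BD that
 * owns the buffer into the TX software BD, and flag it so the TX
 * completion path knows to recycle rather than free.
 */
static void xdp_tx_swbd_from_rx(struct enetc_tx_swbd *tx_swbd,
				struct enetc_rx_swbd *rx_swbd)
{
	tx_swbd->dma = rx_swbd->dma;
	tx_swbd->page = rx_swbd->page;
	tx_swbd->page_offset = rx_swbd->page_offset;
	tx_swbd->len = rx_swbd->len;
	tx_swbd->dir = rx_swbd->dir;
	tx_swbd->is_dma_page = 1;	/* buffer is a page, not kmalloc */
	tx_swbd->is_xdp_tx = 1;
	tx_swbd->is_eof = 1;		/* assuming a single-buffer frame */
}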

For a frame with the XDP_TX verdict, we don't reuse any buffer right
away as we do for XDP_DROP (the same page half) or XDP_PASS (the other
page half, same as the skb code path).

Because the buffer transfers ownership from the RX ring to the TX ring,
reusing any page half right away is very dangerous. So what we can do
instead is recycle the same page half as soon as TX is complete.
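
In sketch form, the recycle step looks roughly like this (the real
function is enetc_recycle_xdp_tx_buff; the body below is only an
approximation of its shape, and rx_ring->dev is assumed to be the
ring's DMA device):

/* On TX completion, hand the page half back to the RX ring if there is
 * room; otherwise unmap and free it. The recycles/recycle_failures
 * counters correspond to the new ethtool statistics.
 */
static void recycle_xdp_tx_buff(struct enetc_bdr *rx_ring,
				struct enetc_tx_swbd *tx_swbd)
{
	struct enetc_rx_swbd rx_swbd = {
		.dma = tx_swbd->dma,
		.page = tx_swbd->page,
		.page_offset = tx_swbd->page_offset,
		.dir = tx_swbd->dir,
	};

	if (enetc_swbd_unused(rx_ring)) {
		/* Park the page half at next_to_alloc; a later refill
		 * can pick it up without touching the page allocator.
		 */
		rx_ring->rx_swbd[rx_ring->next_to_alloc] = rx_swbd;
		if (unlikely(++rx_ring->next_to_alloc == rx_ring->bd_count))
			rx_ring->next_to_alloc = 0;
		rx_ring->stats.recycles++;
	} else {
		/* RX ring is already full, drop the buffer */
		rx_ring->stats.recycle_failures++;
		dma_unmap_page(rx_ring->dev, rx_swbd.dma, PAGE_SIZE,
			       rx_swbd.dir);
		__free_page(rx_swbd.page);
	}
}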

The code path is:
enetc_poll
-> enetc_clean_rx_ring_xdp
   -> enetc_xdp_tx
   -> enetc_refill_rx_ring
(time passes, another MSI interrupt is raised)
enetc_poll
-> enetc_clean_tx_ring
   -> enetc_recycle_xdp_tx_buff

But that creates a problem: there is a potentially large time window
between enetc_xdp_tx and enetc_recycle_xdp_tx_buff, a period in which
we'll have fewer and fewer RX buffers.

Basically, when the ship starts sinking, the knee-jerk reaction is to
let enetc_refill_rx_ring do what it does for the standard skb code path
(refill every 16 consumed buffers), but that turns out to be very
inefficient. The problem is that we have no rx_swbd->page at our
disposal from the enetc_reuse_page path, so enetc_refill_rx_ring would
have to call enetc_new_page for every buffer that we refill (if we
choose to refill at this early stage). That only makes the problem
worse, because page allocation is an expensive process, and CPU time is
exactly what we're lacking.

Additionally, there is an even bigger problem: if we let
enetc_refill_rx_ring top up the ring's buffers again from the RX path,
remember that the buffers sent for transmission haven't disappeared
anywhere. They will eventually be sent, processed in
enetc_clean_tx_ring, and an attempt will be made to recycle them.
But surprise: the RX ring is already full of new buffers, because we
were premature in deciding to refill. So not only did we take the
expensive decision of allocating new pages, but now we must also throw
away perfectly good, reusable buffers.

So what we do is implement an elastic refill mechanism, which keeps
track of the number of in-flight XDP_TX buffer descriptors. We top up
the RX ring only up to the total ring capacity minus the number of BDs
that are in flight (because we know that those BDs will return to us
eventually).
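
Roughly, the refill decision on the RX path becomes this (a sketch; it
assumes enetc_bdr embeds the new struct enetc_xdp_data as ->xdp, and
that enetc_refill_rx_ring takes the number of buffers to refill):

/* Top up only what the in-flight XDP_TX buffers will not give back. */
static void elastic_refill(struct enetc_bdr *rx_ring)
{
	int missing = enetc_bd_unused(rx_ring) -
		      rx_ring->xdp.xdp_tx_in_flight;

	if (missing > 0)
		enetc_refill_rx_ring(rx_ring, missing);
}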

The enetc driver manages 1 RX ring per CPU, and the default TX ring
management is the same. So we do XDP_TX towards the TX ring of the same
index, because it is affined to the same CPU. This will probably not
produce great results when we have a tc-taprio/tc-mqprio qdisc on the
interface, because in that case, the number of TX rings might be
greater, but I didn't add any checks for that yet (mostly because I
didn't know what checks to add).
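
The ring selection itself is then trivial (a sketch; the index pairing
is the point here, and rx_ring->index is an assumed field):

/* XDP_TX picks the TX ring with the same index as the RX ring, which
 * is affined to the same CPU under the default ring layout.
 */
static struct enetc_bdr *xdp_tx_ring(struct enetc_ndev_priv *priv,
				     struct enetc_bdr *rx_ring)
{
	return priv->tx_ring[rx_ring->index];
}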

It should also be noted that we need to change the DMA mapping direction
for RX buffers, since they may now be reflected into the TX ring of the
same device. We choose to use DMA_BIDIRECTIONAL instead of unmapping and
remapping as DMA_TO_DEVICE, because performance is better this way.
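
For illustration, the allocation path becomes something like this (a
sketch of the enetc_new_page shape; rx_ring->dev and
rx_ring->buffer_offset are assumptions, not fields shown in this diff):

/* Map RX pages DMA_BIDIRECTIONAL once, so a page half can later be
 * attached to a TX BD without unmapping and remapping as DMA_TO_DEVICE.
 */
static bool new_page(struct enetc_bdr *rx_ring,
		     struct enetc_rx_swbd *rx_swbd)
{
	struct page *page = dev_alloc_page();

	if (!page)
		return false;

	rx_swbd->dma = dma_map_page(rx_ring->dev, page, 0, PAGE_SIZE,
				    DMA_BIDIRECTIONAL);
	if (unlikely(dma_mapping_error(rx_ring->dev, rx_swbd->dma))) {
		__free_page(page);
		return false;
	}

	rx_swbd->dir = DMA_BIDIRECTIONAL;
	rx_swbd->page = page;
	rx_swbd->page_offset = rx_ring->buffer_offset; /* assumed field */
	return true;
}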

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent d1b15102

drivers/net/ethernet/freescale/enetc/enetc.h
@@ -21,11 +21,15 @@
struct enetc_tx_swbd {
	struct sk_buff *skb;
	dma_addr_t dma;
	struct page *page; /* valid only if is_xdp_tx */
	u16 page_offset; /* valid only if is_xdp_tx */
	u16 len;
	enum dma_data_direction dir;
	u8 is_dma_page:1;
	u8 check_wb:1;
	u8 do_tstamp:1;
	u8 is_eof:1;
	u8 is_xdp_tx:1;
};

#define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE
@@ -40,18 +44,31 @@ struct enetc_rx_swbd {
	dma_addr_t dma;
	struct page *page;
	u16 page_offset;
	enum dma_data_direction dir;
	u16 len;
};

/* ENETC overhead: optional extension BD + 1 BD gap */
#define ENETC_TXBDS_NEEDED(val) ((val) + 2)
/* max # of chained Tx BDs is 15, including head and extension BD */
#define ENETC_MAX_SKB_FRAGS 13
#define ENETC_TXBDS_MAX_NEEDED ENETC_TXBDS_NEEDED(ENETC_MAX_SKB_FRAGS + 1)

struct enetc_ring_stats {
	unsigned int packets;
	unsigned int bytes;
	unsigned int rx_alloc_errs;
	unsigned int xdp_drops;
	unsigned int xdp_tx;
	unsigned int xdp_tx_drops;
	unsigned int recycles;
	unsigned int recycle_failures;
};

struct enetc_xdp_data {
	struct xdp_rxq_info rxq;
	struct bpf_prog *prog;
	int xdp_tx_in_flight;
};

#define ENETC_RX_RING_DEFAULT_SIZE 512
@@ -104,6 +121,14 @@ static inline int enetc_bd_unused(struct enetc_bdr *bdr)
	return bdr->bd_count + bdr->next_to_clean - bdr->next_to_use - 1;
}
static inline int enetc_swbd_unused(struct enetc_bdr *bdr)
{
	if (bdr->next_to_clean > bdr->next_to_alloc)
		return bdr->next_to_clean - bdr->next_to_alloc - 1;

	return bdr->bd_count + bdr->next_to_clean - bdr->next_to_alloc - 1;
}

/* Control BD ring */
#define ENETC_CBDR_DEFAULT_SIZE 64
struct enetc_cbdr {

drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
@@ -193,10 +193,14 @@ static const char rx_ring_stats[][ETH_GSTRING_LEN] = {
"Rx ring %2d frames",
"Rx ring %2d alloc errors",
"Rx ring %2d XDP drops",
"Rx ring %2d recycles",
"Rx ring %2d recycle failures",
};
static const char tx_ring_stats[][ETH_GSTRING_LEN] = {
"Tx ring %2d frames",
"Tx ring %2d XDP frames",
"Tx ring %2d XDP drops",
};
static int enetc_get_sset_count(struct net_device *ndev, int sset)
@@ -268,13 +272,18 @@ static void enetc_get_ethtool_stats(struct net_device *ndev,
for (i = 0; i < ARRAY_SIZE(enetc_si_counters); i++)
data[o++] = enetc_rd64(hw, enetc_si_counters[i].reg);
for (i = 0; i < priv->num_tx_rings; i++)
for (i = 0; i < priv->num_tx_rings; i++) {
data[o++] = priv->tx_ring[i]->stats.packets;
data[o++] = priv->tx_ring[i]->stats.xdp_tx;
data[o++] = priv->tx_ring[i]->stats.xdp_tx_drops;
}
for (i = 0; i < priv->num_rx_rings; i++) {
data[o++] = priv->rx_ring[i]->stats.packets;
data[o++] = priv->rx_ring[i]->stats.rx_alloc_errs;
data[o++] = priv->rx_ring[i]->stats.xdp_drops;
data[o++] = priv->rx_ring[i]->stats.recycles;
data[o++] = priv->rx_ring[i]->stats.recycle_failures;
}
if (!enetc_si_is_pf(priv->si))