Commit 36e68ead authored by Jakub Kicinski

Merge branch 'docs-net-page_pool-sync-dev-and-kdoc'

Jakub Kicinski says:

====================
docs: net: page_pool: sync dev and kdoc

Document PP_FLAG_DMA_SYNC_DEV based on recent conversation.
Use kdoc to document structs and functions, to avoid duplication.

Olek, this will conflict with your work, but I think that trying
to make progress in parallel is the best course of action...
Retargeting at net-next to make it a little less bad.
====================

Link: https://lore.kernel.org/r/20230802161821.3621985-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents 39868926 82e896d9
@@ -64,84 +64,68 @@ This lockless guarantee naturally comes from running under a NAPI softirq.
 The protection doesn't strictly have to be NAPI, any guarantee that allocating
 a page will cause no race conditions is enough.
 
-* page_pool_create(): Create a pool.
-    * flags: PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV
-    * order: 2^order pages on allocation
-    * pool_size: size of the ptr_ring
-    * nid: preferred NUMA node for allocation
-    * dev: struct device. Used on DMA operations
-    * dma_dir: DMA direction
-    * max_len: max DMA sync memory size
-    * offset: DMA address offset
-
-* page_pool_put_page(): The outcome of this depends on the page refcnt. If the
-  driver bumps the refcnt > 1 this will unmap the page. If the page refcnt is 1
-  the allocator owns the page and will try to recycle it in one of the pool
-  caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
-  using dma_sync_single_range_for_device().
-
-* page_pool_put_full_page(): Similar to page_pool_put_page(), but will DMA sync
-  for the entire memory area configured in area pool->max_len.
-
-* page_pool_recycle_direct(): Similar to page_pool_put_full_page() but caller
-  must guarantee safe context (e.g NAPI), since it will recycle the page
-  directly into the pool fast cache.
-
-* page_pool_dev_alloc_pages(): Get a page from the page allocator or page_pool
-  caches.
-
-* page_pool_get_dma_addr(): Retrieve the stored DMA address.
-
-* page_pool_get_dma_dir(): Retrieve the stored DMA direction.
-
-* page_pool_put_page_bulk(): Tries to refill a number of pages into the
-  ptr_ring cache holding ptr_ring producer lock. If the ptr_ring is full,
-  page_pool_put_page_bulk() will release leftover pages to the page allocator.
-  page_pool_put_page_bulk() is suitable to be run inside the driver NAPI tx
-  completion loop for the XDP_REDIRECT use case.
-  Please note the caller must not use data area after running
-  page_pool_put_page_bulk(), as this function overwrites it.
-
-* page_pool_get_stats(): Retrieve statistics about the page_pool. This API
-  is only available if the kernel has been configured with
-  ``CONFIG_PAGE_POOL_STATS=y``. A pointer to a caller allocated ``struct
-  page_pool_stats`` structure is passed to this API which is filled in. The
-  caller can then report those stats to the user (perhaps via ethtool,
-  debugfs, etc.). See below for an example usage of this API.
+.. kernel-doc:: net/core/page_pool.c
+   :identifiers: page_pool_create
+
+.. kernel-doc:: include/net/page_pool.h
+   :identifiers: struct page_pool_params
+
+.. kernel-doc:: include/net/page_pool.h
+   :identifiers: page_pool_put_page page_pool_put_full_page
+                 page_pool_recycle_direct page_pool_dev_alloc_pages
+                 page_pool_get_dma_addr page_pool_get_dma_dir
+
+.. kernel-doc:: net/core/page_pool.c
+   :identifiers: page_pool_put_page_bulk page_pool_get_stats
+
+DMA sync
+--------
+Driver is always responsible for syncing the pages for the CPU.
+Drivers may choose to take care of syncing for the device as well
+or set the ``PP_FLAG_DMA_SYNC_DEV`` flag to request that pages
+allocated from the page pool are already synced for the device.
+
+If ``PP_FLAG_DMA_SYNC_DEV`` is set, the driver must inform the core what portion
+of the buffer has to be synced. This allows the core to avoid syncing the entire
+page when the drivers knows that the device only accessed a portion of the page.
+
+Most drivers will reserve headroom in front of the frame. This part
+of the buffer is not touched by the device, so to avoid syncing
+it drivers can set the ``offset`` field in struct page_pool_params
+appropriately.
+
+For pages recycled on the XDP xmit and skb paths the page pool will
+use the ``max_len`` member of struct page_pool_params to decide how
+much of the page needs to be synced (starting at ``offset``).
+When directly freeing pages in the driver (page_pool_put_page())
+the ``dma_sync_size`` argument specifies how much of the buffer needs
+to be synced.
+
+If in doubt set ``offset`` to 0, ``max_len`` to ``PAGE_SIZE`` and
+pass -1 as ``dma_sync_size``. That combination of arguments is always
+correct.
+
+Note that the syncing parameters are for the entire page.
+This is important to remember when using fragments (``PP_FLAG_PAGE_FRAG``),
+where allocated buffers may be smaller than a full page.
+Unless the driver author really understands page pool internals
+it's recommended to always use ``offset = 0``, ``max_len = PAGE_SIZE``
+with fragmented page pools.
 
 Stats API and structures
 ------------------------
 If the kernel is configured with ``CONFIG_PAGE_POOL_STATS=y``, the API
-``page_pool_get_stats()`` and structures described below are available. It
-takes a pointer to a ``struct page_pool`` and a pointer to a ``struct
-page_pool_stats`` allocated by the caller.
+page_pool_get_stats() and structures described below are available.
+It takes a pointer to a ``struct page_pool`` and a pointer to a struct
+page_pool_stats allocated by the caller.
 
-The API will fill in the provided ``struct page_pool_stats`` with
+The API will fill in the provided struct page_pool_stats with
 statistics about the page_pool.
 
-The stats structure has the following fields::
-
-    struct page_pool_stats {
-        struct page_pool_alloc_stats alloc_stats;
-        struct page_pool_recycle_stats recycle_stats;
-    };
-
-
-The ``struct page_pool_alloc_stats`` has the following fields:
-  * ``fast``: successful fast path allocations
-  * ``slow``: slow path order-0 allocations
-  * ``slow_high_order``: slow path high order allocations
-  * ``empty``: ptr ring is empty, so a slow path allocation was forced.
-  * ``refill``: an allocation which triggered a refill of the cache
-  * ``waive``: pages obtained from the ptr ring that cannot be added to
-    the cache due to a NUMA mismatch.
-
-The ``struct page_pool_recycle_stats`` has the following fields:
-  * ``cached``: recycling placed page in the page pool cache
-  * ``cache_full``: page pool cache was full
-  * ``ring``: page placed into the ptr ring
-  * ``ring_full``: page released from page pool because the ptr ring was full
-  * ``released_refcnt``: page released (and not recycled) because refcnt > 1
+.. kernel-doc:: include/net/page_pool.h
+   :identifiers: struct page_pool_recycle_stats
+                 struct page_pool_alloc_stats
+                 struct page_pool_stats
 
 Coding examples
 ===============
......
@@ -70,47 +70,76 @@ struct pp_alloc_cache {
 	struct page *cache[PP_ALLOC_CACHE_SIZE];
 };
 
+/**
+ * struct page_pool_params - page pool parameters
+ * @flags: PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV, PP_FLAG_PAGE_FRAG
+ * @order: 2^order pages on allocation
+ * @pool_size: size of the ptr_ring
+ * @nid: NUMA node id to allocate from pages from
+ * @dev: device, for DMA pre-mapping purposes
+ * @napi: NAPI which is the sole consumer of pages, otherwise NULL
+ * @dma_dir: DMA mapping direction
+ * @max_len: max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
+ * @offset: DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
+ */
 struct page_pool_params {
 	unsigned int flags;
 	unsigned int order;
 	unsigned int pool_size;
-	int nid; /* Numa node id to allocate from pages from */
-	struct device *dev; /* device, for DMA pre-mapping purposes */
-	struct napi_struct *napi; /* Sole consumer of pages, otherwise NULL */
-	enum dma_data_direction dma_dir; /* DMA mapping direction */
-	unsigned int max_len; /* max DMA sync memory size */
-	unsigned int offset; /* DMA addr offset */
+	int nid;
+	struct device *dev;
+	struct napi_struct *napi;
+	enum dma_data_direction dma_dir;
+	unsigned int max_len;
+	unsigned int offset;
+/* private: used by test code only */
 	void (*init_callback)(struct page *page, void *arg);
 	void *init_arg;
 };
 
 #ifdef CONFIG_PAGE_POOL_STATS
+/**
+ * struct page_pool_alloc_stats - allocation statistics
+ * @fast: successful fast path allocations
+ * @slow: slow path order-0 allocations
+ * @slow_high_order: slow path high order allocations
+ * @empty: ptr ring is empty, so a slow path allocation was forced
+ * @refill: an allocation which triggered a refill of the cache
+ * @waive: pages obtained from the ptr ring that cannot be added to
+ *	   the cache due to a NUMA mismatch
+ */
 struct page_pool_alloc_stats {
-	u64 fast; /* fast path allocations */
-	u64 slow; /* slow-path order 0 allocations */
-	u64 slow_high_order; /* slow-path high order allocations */
-	u64 empty; /* failed refills due to empty ptr ring, forcing
-		    * slow path allocation
-		    */
-	u64 refill; /* allocations via successful refill */
-	u64 waive; /* failed refills due to numa zone mismatch */
+	u64 fast;
+	u64 slow;
+	u64 slow_high_order;
+	u64 empty;
+	u64 refill;
+	u64 waive;
 };
 
+/**
+ * struct page_pool_recycle_stats - recycling (freeing) statistics
+ * @cached: recycling placed page in the page pool cache
+ * @cache_full: page pool cache was full
+ * @ring: page placed into the ptr ring
+ * @ring_full: page released from page pool because the ptr ring was full
+ * @released_refcnt: page released (and not recycled) because refcnt > 1
+ */
 struct page_pool_recycle_stats {
-	u64 cached; /* recycling placed page in the cache. */
-	u64 cache_full; /* cache was full */
-	u64 ring; /* recycling placed page back into ptr ring */
-	u64 ring_full; /* page was released from page-pool because
-			* PTR ring was full.
-			*/
-	u64 released_refcnt; /* page released because of elevated
-			      * refcnt
-			      */
+	u64 cached;
+	u64 cache_full;
+	u64 ring;
+	u64 ring_full;
+	u64 released_refcnt;
 };
 
-/* This struct wraps the above stats structs so users of the
- * page_pool_get_stats API can pass a single argument when requesting the
- * stats for the page pool.
+/**
+ * struct page_pool_stats - combined page pool use statistics
+ * @alloc_stats: see struct page_pool_alloc_stats
+ * @recycle_stats: see struct page_pool_recycle_stats
+ *
+ * Wrapper struct for combining page pool stats with different storage
+ * requirements.
  */
 struct page_pool_stats {
 	struct page_pool_alloc_stats alloc_stats;
@@ -211,6 +240,12 @@ struct page_pool {
 
 struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp);
 
+/**
+ * page_pool_dev_alloc_pages() - allocate a page.
+ * @pool: pool from which to allocate
+ *
+ * Get a page from the page allocator or page_pool caches.
+ */
 static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool)
 {
 	gfp_t gfp = (GFP_ATOMIC | __GFP_NOWARN);
@@ -230,8 +265,12 @@ static inline struct page *page_pool_dev_alloc_frag(struct page_pool *pool,
 	return page_pool_alloc_frag(pool, offset, size, gfp);
 }
 
-/* get the stored dma direction. A driver might decide to treat this locally and
- * avoid the extra cache line from page_pool to determine the direction
+/**
+ * page_pool_get_dma_dir() - Retrieve the stored DMA direction.
+ * @pool: pool from which page was allocated
+ *
+ * Get the stored dma direction. A driver might decide to store this locally
+ * and avoid the extra cache line from page_pool to determine the direction.
  */
 static
 inline enum dma_data_direction page_pool_get_dma_dir(struct page_pool *pool)
@@ -321,6 +360,19 @@ static inline bool page_pool_is_last_frag(struct page_pool *pool,
 	       (page_pool_defrag_page(page, 1) == 0);
 }
 
+/**
+ * page_pool_put_page() - release a reference to a page pool page
+ * @pool: pool from which page was allocated
+ * @page: page to release a reference on
+ * @dma_sync_size: how much of the page may have been touched by the device
+ * @allow_direct: released by the consumer, allow lockless caching
+ *
+ * The outcome of this depends on the page refcnt. If the driver bumps
+ * the refcnt > 1 this will unmap the page. If the page refcnt is 1
+ * the allocator owns the page and will try to recycle it in one of the pool
+ * caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
+ * using dma_sync_single_range_for_device().
+ */
 static inline void page_pool_put_page(struct page_pool *pool,
 				      struct page *page,
 				      unsigned int dma_sync_size,
@@ -337,14 +389,29 @@ static inline void page_pool_put_page(struct page_pool *pool,
 #endif
 }
 
-/* Same as above but will try to sync the entire area pool->max_len */
+/**
+ * page_pool_put_full_page() - release a reference on a page pool page
+ * @pool: pool from which page was allocated
+ * @page: page to release a reference on
+ * @allow_direct: released by the consumer, allow lockless caching
+ *
+ * Similar to page_pool_put_page(), but will DMA sync the entire memory area
+ * as configured in &page_pool_params.max_len.
+ */
 static inline void page_pool_put_full_page(struct page_pool *pool,
 					   struct page *page, bool allow_direct)
 {
 	page_pool_put_page(pool, page, -1, allow_direct);
 }
 
-/* Same as above but the caller must guarantee safe context. e.g NAPI */
+/**
+ * page_pool_recycle_direct() - release a reference on a page pool page
+ * @pool: pool from which page was allocated
+ * @page: page to release a reference on
+ *
+ * Similar to page_pool_put_full_page() but caller must guarantee safe context
+ * (e.g NAPI), since it will recycle the page directly into the pool fast cache.
+ */
 static inline void page_pool_recycle_direct(struct page_pool *pool,
 					    struct page *page)
 {
@@ -354,6 +421,13 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 #define PAGE_POOL_DMA_USE_PP_FRAG_COUNT	\
 		(sizeof(dma_addr_t) > sizeof(unsigned long))
 
+/**
+ * page_pool_get_dma_addr() - Retrieve the stored DMA address.
+ * @page: page allocated from a page pool
+ *
+ * Fetch the DMA address of the page. The page pool to which the page belongs
+ * must had been created with PP_FLAG_DMA_MAP.
+ */
 static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
 {
 	dma_addr_t ret = page->dma_addr;
......
@@ -58,6 +58,17 @@ static const char pp_stats[][ETH_GSTRING_LEN] = {
 	"rx_pp_recycle_released_ref",
 };
 
+/**
+ * page_pool_get_stats() - fetch page pool stats
+ * @pool: pool from which page was allocated
+ * @stats: struct page_pool_stats to fill in
+ *
+ * Retrieve statistics about the page_pool. This API is only available
+ * if the kernel has been configured with ``CONFIG_PAGE_POOL_STATS=y``.
+ * A pointer to a caller allocated struct page_pool_stats structure
+ * is passed to this API which is filled in. The caller can then report
+ * those stats to the user (perhaps via ethtool, debugfs, etc.).
+ */
 bool page_pool_get_stats(struct page_pool *pool,
 			 struct page_pool_stats *stats)
 {
@@ -224,6 +235,10 @@ static int page_pool_init(struct page_pool *pool,
 	return 0;
 }
 
+/**
+ * page_pool_create() - create a page pool.
+ * @params: parameters, see struct page_pool_params
+ */
 struct page_pool *page_pool_create(const struct page_pool_params *params)
 {
 	struct page_pool *pool;
@@ -626,7 +641,21 @@ void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
 }
 EXPORT_SYMBOL(page_pool_put_defragged_page);
 
-/* Caller must not use data area after call, as this function overwrites it */
+/**
+ * page_pool_put_page_bulk() - release references on multiple pages
+ * @pool: pool from which pages were allocated
+ * @data: array holding page pointers
+ * @count: number of pages in @data
+ *
+ * Tries to refill a number of pages into the ptr_ring cache holding ptr_ring
+ * producer lock. If the ptr_ring is full, page_pool_put_page_bulk()
+ * will release leftover pages to the page allocator.
+ * page_pool_put_page_bulk() is suitable to be run inside the driver NAPI tx
+ * completion loop for the XDP_REDIRECT use case.
+ *
+ * Please note the caller must not use data area after running
+ * page_pool_put_page_bulk(), as this function overwrites it.
+ */
 void page_pool_put_page_bulk(struct page_pool *pool, void **data,
 			     int count)
 {
......
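
The DMA sync rules documented in this change (``offset``, ``max_len`` and the ``dma_sync_size`` argument) can be illustrated with a small userspace model. This is only a sketch, not kernel code: ``struct sync_params``, ``struct sync_range``, ``model_sync_range()``, ``model_safe_params()`` and ``MODEL_PAGE_SIZE`` are hypothetical names invented for illustration, and the clamping of ``dma_sync_size`` to ``max_len`` is an assumption consistent with the description of page_pool_put_full_page() passing -1 to sync the whole configured area.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
#define MODEL_PAGE_SIZE 4096u

struct sync_params {
	unsigned int offset;	/* mirrors page_pool_params.offset */
	unsigned int max_len;	/* mirrors page_pool_params.max_len */
};

struct sync_range {
	unsigned int start;	/* first byte synced, relative to page start */
	unsigned int len;	/* number of bytes synced for the device */
};

/*
 * Model of the range synced for the device when a page is recycled
 * (assumption): the sync starts at offset and covers at most max_len
 * bytes. Passing -1 converts to UINT_MAX, so the clamp selects max_len.
 */
static struct sync_range model_sync_range(const struct sync_params *p,
					  unsigned int dma_sync_size)
{
	struct sync_range r;

	r.start = p->offset;
	r.len = dma_sync_size < p->max_len ? dma_sync_size : p->max_len;
	return r;
}

/* The "if in doubt" settings from the text: offset 0, max_len PAGE_SIZE. */
static struct sync_params model_safe_params(void)
{
	struct sync_params p = { .offset = 0, .max_len = MODEL_PAGE_SIZE };

	return p;
}
```

In this model, a pool with 256 bytes of headroom (``offset = 256``, ``max_len = 2048``) that frees a page with ``dma_sync_size = 1500`` syncs 1500 bytes starting at byte 256, while passing -1 syncs the full 2048 bytes.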