    io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled · 32b2244a
    Xiaoguang Wang authored
    When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, applications don't need
    to poll for I/O completion events themselves; they can rely on io_sq_thread to
    do the polling, which reduces CPU usage and uring_lock contention.
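
The rule described above can be sketched as a small predicate. This is only an
illustration of the decision, not the kernel code itself: the SKETCH_ constants
and the function name are hypothetical, although the two bit values are the
ones used for IORING_SETUP_IOPOLL and IORING_SETUP_SQPOLL in <linux/io_uring.h>.

```c
#include <stdbool.h>

/* Hypothetical local copies of the IORING_SETUP_* bits, redefined under
 * SKETCH_ names so the example builds without kernel headers. */
#define SKETCH_SETUP_IOPOLL (1U << 0)
#define SKETCH_SETUP_SQPOLL (1U << 1)

/* Hypothetical helper: returns true when an io_uring_enter(2) call with
 * IORING_ENTER_GETEVENTS should actively poll for completions itself, and
 * false when it should only wait for completions to show up in the CQ ring. */
static bool sketch_enter_should_poll(unsigned int setup_flags)
{
    bool iopoll = (setup_flags & SKETCH_SETUP_IOPOLL) != 0;
    bool sqpoll = (setup_flags & SKETCH_SETUP_SQPOLL) != 0;

    /* With both flags set, io_sq_thread already does the polling, so the
     * caller can skip it and avoid contending on uring_lock. */
    return iopoll && !sqpoll;
}
```

Under these assumptions, only a ring set up with IOPOLL but without SQPOLL
makes the caller poll; every other combination takes the waiting path.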
    
    I modified the fio io_uring engine code slightly to evaluate the performance:
    static int fio_ioring_getevents(struct thread_data *td, unsigned int min,
                            continue;
                    }
    
    -               if (!o->sqpoll_thread) {
    +               if (o->sqpoll_thread && o->hipri) {
                            r = io_uring_enter(ld, 0, actual_min,
                                                    IORING_ENTER_GETEVENTS);
                            if (r < 0) {
    
    and ran "fio -name=fiotest -filename=/dev/nvme0n1 -iodepth=$depth -thread
    -rw=read -ioengine=io_uring -hipri=1 -sqthread_poll=1 -direct=1 -bs=4k
    -size=10G -numjobs=1 -time_based -runtime=120"
    
    original code
    --------------------------------------------------------------------
    iodepth       |        4 |        8 |       16 |       32 |       64
    bw            | 1133MB/s | 1519MB/s | 2090MB/s | 2710MB/s | 3012MB/s
    fio cpu usage |     100% |     100% |     100% |     100% |     100%
    --------------------------------------------------------------------
    
    with patch
    --------------------------------------------------------------------
    iodepth       |        4 |        8 |       16 |       32 |       64
    bw            | 1196MB/s | 1721MB/s | 2351MB/s | 2977MB/s | 3357MB/s
    fio cpu usage |    63.8% |    74.4% |    81.1% |    83.7% |    82.4%
    --------------------------------------------------------------------
    bw improve    |     5.5% |    13.2% |    12.3% |     9.8% |    11.5%
    --------------------------------------------------------------------
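
For reference, the "bw improve" row is a relative gain over the original run.
The sketch below shows that arithmetic; the helper name is hypothetical.
Recomputing from the rounded MB/s figures in the tables gives values within
about 0.2 percentage points of the reported row, which is consistent with the
reported row having been derived from unrounded measurements.

```c
/* Hypothetical helper: relative bandwidth gain, in percent, of a patched
 * run over the original run. Both arguments must use the same unit. */
static double sketch_bw_gain_percent(double before_mbps, double after_mbps)
{
    return (after_mbps - before_mbps) / before_mbps * 100.0;
}
```

For example, sketch_bw_gain_percent(1133.0, 1196.0) corresponds to the
iodepth=4 column.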
    
    From the test results above, bandwidth improves by 5.5% to 13.2%, and the
    fio process's CPU usage also drops considerably. Note that this patch does
    not reduce io_sq_thread's CPU usage: when SETUP_IOPOLL and SETUP_SQPOLL are
    both enabled, io_sq_thread always uses 100% CPU.
    I think this patch will benefit applications that often call
    io_uring_wait_cqe() or similar helpers from liburing.
    Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring.c 187 KB