    pNFS: Avoid read/modify/write when it is not necessary · 2cde04e9
    Kazuo Ito authored
    As the block and SCSI layouts can only read/write fixed-length
    blocks, we must perform read-modify-write when the data to be written
    is not aligned to a block boundary or is smaller than the block size.
    (612aa983 pnfs: add flag to force read-modify-write in ->write_begin)
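    
    The alignment rule above can be sketched in plain userspace C. This is
    only an illustration, not kernel code: write_is_block_aligned(),
    enclosing_blocks() and struct blk_range are hypothetical names, and
    block_size is assumed to be non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel API: true when [pos, pos + len)
 * starts and ends on a block boundary, so whole blocks can be written
 * without reading them first. Assumes block_size != 0. */
static bool write_is_block_aligned(uint64_t pos, uint64_t len,
				   uint64_t block_size)
{
	return len != 0 && pos % block_size == 0 && len % block_size == 0;
}

/* Hypothetical type: a half-open byte range [start, end). */
struct blk_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical helper: the smallest block-aligned range that contains
 * [pos, pos + len). For an unaligned write, this is the range that
 * read-modify-write would have to read before modifying it. */
static struct blk_range enclosing_blocks(uint64_t pos, uint64_t len,
					 uint64_t block_size)
{
	struct blk_range r;
	uint64_t end = pos + len;

	r.start = pos - pos % block_size;                         /* round down */
	r.end = (end + block_size - 1) / block_size * block_size; /* round up */
	return r;
}
```

    With 4096-byte blocks, a 200-byte write at offset 4000 is not aligned
    and touches the two blocks covering bytes 0 to 8191, whereas a
    4096-byte write at offset 0 is aligned and needs no prior read.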
    
    The current code decides whether read-modify-write is needed
    on block-oriented pNFS layouts by checking only !PageUptodate(page),
    but that condition also holds when overwriting any uncached
    portions of existing files, making such operations excessively slow
    even when they are block-aligned.
    
    The change does not affect the optimization for read-modify-write
    cases (38c73044 NFS: read-modify-write page updating),
    because partial updates of !PageUptodate() pages can only happen
    in layouts that can do arbitrary-length reads/writes and never
    in block-based ones.
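    
    The difference between the old and new decision can be modelled in
    userspace C. This is a sketch under stated assumptions, not the diff
    itself: struct sketch_page, old_wants_rmw(), covers_whole_page() and
    new_wants_rmw() are hypothetical names, the page size is assumed to be
    4096 bytes, and only the block-oriented layout case is modelled.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, for illustration only */

/* Hypothetical stand-in for the page state the kernel would consult. */
struct sketch_page {
	bool uptodate;          /* models PageUptodate(page) */
	unsigned int valid_len; /* bytes of this page that lie inside the file */
};

/* Old behaviour as described above: any page that is not up to date
 * triggers read-modify-write on a block-oriented layout. */
static bool old_wants_rmw(const struct sketch_page *pg)
{
	return !pg->uptodate;
}

/* True when the write covers every valid byte of the page, so nothing
 * that must be preserved would be lost by skipping the read. */
static bool covers_whole_page(const struct sketch_page *pg,
			      unsigned long long pos, unsigned int len)
{
	unsigned int offset = (unsigned int)(pos % SKETCH_PAGE_SIZE);
	unsigned int end = offset + len;

	return pg->valid_len == 0 || (offset == 0 && end >= pg->valid_len);
}

/* Sketched new behaviour: skip the read when the page is already up to
 * date or when the write replaces the whole page. */
static bool new_wants_rmw(const struct sketch_page *pg,
			  unsigned long long pos, unsigned int len)
{
	if (pg->uptodate || covers_whole_page(pg, pos, len))
		return false;
	return true;
}
```

    For an uncached page fully inside the file, a page-aligned 4096-byte
    overwrite makes the old check request a read while the sketched new
    check does not; a 1024-byte write at offset 512 still requests one.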
    
    Testing results:
    
    We ran fio on one of the pNFS clients running a 4.20 kernel
    (vanilla and patched) in this configuration to read/write/overwrite
    files on the storage array, exported as a pNFS share by the server.
    
     pNFS clients ---1G Ethernet--- pNFS server
     (HP DL360 G8)                  (HP DL360 G8)
           |                              |
           |                              |
           +------8G Fibre Channel--------+
                         |
                   Storage Array
                     (HP P6350)
    
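    Throughput of overwrite (both buffered and O_SYNC) is noticeably
    improved: roughly 11x for buffered overwrite, and about 1.2x to 8.2x
    for O_SYNC overwrite depending on the block size.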
    
    Ops.     |block size|   Throughput   |
             |  (KiB)   |    (MiB/s)     |
             |          |  4.20 | patched|
    ---------+----------+----------------+
    buffered |         4|  21.3 |  232   |
    overwrite|        32|  22.2 |  256   |
             |       512|  22.4 |  260   |
    ---------+----------+----------------+
    O_SYNC   |         4|   3.84|    4.77|
    overwrite|        32|  12.2 |   32.0 |
             |       512|  18.5 |  152   |
    ---------+----------+----------------+
    
    Read and write (buffered and O_SYNC) throughput on the same client is
    essentially unchanged by the patch, as expected.
    
    Ops.     |block size|   Throughput   |
             |  (KiB)   |    (MiB/s)     |
             |          |  4.20 | patched|
    ---------+----------+----------------+
    read     |         4| 548   |  550   |
             |        32| 547   |  551   |
             |       512| 548   |  551   |
    ---------+----------+----------------+
    buffered |         4| 237   |  244   |
    write    |        32| 261   |  268   |
             |       512| 265   |  272   |
    ---------+----------+----------------+
    O_SYNC   |         4|   0.46|    0.46|
    write    |        32|   3.60|    3.57|
             |       512| 105   |  106   |
    ---------+----------+----------------+
    
    Signed-off-by: Kazuo Ito <ito_kazuo_g3@lab.ntt.co.jp>
    Tested-by: Hiroyuki Watanabe <watanabe.hiroyuki@lab.ntt.co.jp>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
file.c 22.3 KB