    NFS: readdirplus optimization by cache mechanism · be4c2d47
    luanshi authored
    Listing a very large directory over NFS can take a long time on the
    client. Three factors contribute:
    
    First, ls and practically every other way of listing a directory,
    including Python's os.listdir and find, rely on libc readdir().
    However, readdir() reads only 32K of directory entries at a time, so
    a directory with a very large number of files takes many calls, and a
    long time, to read in full.
    
    Second, each 32K of directory entries that libc readdir() reads is
    split in kernel space into 8 pages. One NFS readdirplus RPC is issued
    per page, which results in a large number of readdirplus RPC calls.
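The page arithmetic above can be sketched in a few lines. This is only an illustration: the 4K page size is an assumption inferred from "32K split into 8 pages", and both function names are hypothetical, not kernel or libc identifiers.

```python
# Toy arithmetic for the "32K buffer = 8 pages, one RPC per page" description.
BUFFER_BYTES = 32 * 1024  # buffer size quoted in the commit message
PAGE_BYTES = 4 * 1024     # assumption: 4K pages (32K / 8 pages)


def pages_per_buffer(buffer_bytes=BUFFER_BYTES, page_bytes=PAGE_BYTES):
    """Number of pages needed to hold one readdir buffer (ceiling division)."""
    return -(-buffer_bytes // page_bytes)


def rpcs_for_directory(total_dirent_bytes, page_bytes=PAGE_BYTES):
    """RPC count if every page of directory data costs one readdirplus RPC."""
    return -(-total_dirent_bytes // page_bytes)


print(pages_per_buffer())  # 8
```

Under this model the RPC count grows linearly with the amount of directory data, which is why very large directories are hit hardest.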
    
    Third, each NFS readdirplus RPC asks for 32K of data (filled by
    nfs_dentry) in order to fill a single page (filled by dentry). We
    found that nearly one third of that data was wasted.
    
    To solve the above problems, a page cache mechanism is introduced.
    One NFS readdirplus RPC now asks for a larger amount of data (more
    than 32K), which can fill more than one page, and the cached pages
    can be used by the next readdir call. This eliminates many
    readdirplus RPC calls and improves readdirplus performance.
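The effect of caching extra pages can be illustrated with a toy model. This is a sketch under stated assumptions, not the patch's implementation: `ToyReaddirCache`, `count_rpcs`, and the page counts are all hypothetical, and the real change lives in the kernel NFS client.

```python
class ToyReaddirCache:
    """Toy model: each simulated RPC fetches `pages_per_rpc` pages, and later
    requests for those pages are served from the cache without a new RPC."""

    def __init__(self, total_pages, pages_per_rpc):
        self.total_pages = total_pages
        self.pages_per_rpc = pages_per_rpc
        self.cached = set()
        self.rpc_calls = 0

    def read_page(self, index):
        if index not in self.cached:
            # Cache miss: one RPC fills this page and the pages that follow it.
            self.rpc_calls += 1
            last = min(index + self.pages_per_rpc, self.total_pages)
            self.cached.update(range(index, last))
        return index


def count_rpcs(total_pages, pages_per_rpc):
    """Read every page of a directory in order and return the RPC count."""
    cache = ToyReaddirCache(total_pages, pages_per_rpc)
    for page in range(total_pages):
        cache.read_page(page)
    return cache.rpc_calls


print(count_rpcs(800, 1))  # 800: one RPC per page
print(count_rpcs(800, 8))  # 100: each RPC fills eight cached pages
```

The numbers here are arbitrary and are not meant to reproduce the measurements reported in this commit.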
    
    TESTING:
    Listing a very large directory (containing 300 thousand files) via NFS:
    
    time ls -l /nfs_mount | wc -l
    
    without the patch:
    300001
    real    1m53.524s
    user    0m2.314s
    sys     0m2.599s
    
    with the patch:
    300001
    real    0m23.487s
    user    0m2.305s
    sys     0m2.558s
    
    Wall-clock time reduced by: 79.3% (1m53.524s to 0m23.487s)
    readdirplus RPC calls reduced by: 85%
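As a cross-check, the wall-clock change can be recomputed from the two `real` timings listed above. The helper name `parse_time` is hypothetical, and it handles only the `NmS.SSSs` format that the shell's `time` prints here.

```python
import re


def parse_time(text):
    """Convert a shell `time` value such as '1m53.524s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


before = parse_time("1m53.524s")  # real time without the patch
after = parse_time("0m23.487s")   # real time with the patch

reduction = (before - after) / before * 100
speedup = before / after

print(f"wall-clock reduction: {reduction:.1f}%")  # 79.3%
print(f"speedup: {speedup:.2f}x")                 # 4.83x
```

The RPC-count figure cannot be recomputed this way, because the raw call counts are not included in the message.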
    
    Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
internal.h 24.2 KB