    fs: Do not check if there is a fsnotify watcher on pseudo inodes · e9c15bad
    Mel Gorman authored
    The kernel uses internal mounts created by kern_mount() and populated
    with files with no lookup path by alloc_file_pseudo() for a variety of
    reasons. An example of such a mount is for anonymous pipes. For pipes,
    every vfs_write(), regardless of filesystem, calls fsnotify_modify()
    to notify of any changes, which incurs a small amount of overhead in
    fsnotify even when there are no watchers. It can also trigger for reads,
    readv and writev; it was simply vfs_write() that was noticed first.
    
    A patch is pending that reduces, but does not eliminate, the overhead of
    fsnotify, but for files that cannot be looked up via a path even that
    small overhead is unnecessary. The user API for all notification
    subsystems (inotify, fanotify, ...) is based on a pathname and a dirfd,
    and proc entries appear to be the only visible representation of the
    files. Proc does not have the same pathname as the internal entry and
    the proc inode is not the same as the internal inode, so even if fanotify
    is used on a file under /proc/XX/fd, no useful events are notified.
    
    This patch changes alloc_file_pseudo() to always opt out of fsnotify by
    setting the FMODE_NONOTIFY flag so that no check is made for fsnotify
    watchers on pseudo files. This should be safe as the underlying helper
    for the dentry is d_alloc_pseudo(), which explicitly states that no
    lookups are ever performed, meaning that fanotify should have nothing
    useful to attach to.
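
    The effect of the flag can be sketched with a small userspace model.
    This is not the kernel code: every model_* name is hypothetical, the
    flag value is illustrative only, and the counter merely stands in for
    the watcher lookup that the early return is meant to avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for FMODE_NONOTIFY; the value is illustrative. */
#define MODEL_FMODE_NONOTIFY 0x4000000u

/* Minimal model of an open file: only the mode bits matter here. */
struct model_file {
    unsigned int f_mode;
};

/* Counts how often the (modelled) watcher lookup is actually reached. */
static unsigned long model_watcher_checks;

/*
 * Models a notification hook such as fsnotify_modify(): files that opted
 * out return before any watcher lookup is attempted.
 */
static bool model_fsnotify_modify(const struct model_file *f)
{
    assert(f != NULL);
    if (f->f_mode & MODEL_FMODE_NONOTIFY)
        return false;           /* opted out: skip the lookup entirely */
    model_watcher_checks++;     /* stands in for the watcher lookup cost */
    return true;
}

/* Models a regular file allocation: mode bits are taken as given. */
static struct model_file model_alloc_file(unsigned int flags)
{
    struct model_file f = { .f_mode = flags };
    return f;
}

/* Models the patched behaviour: pseudo files always opt out. */
static struct model_file model_alloc_file_pseudo(unsigned int flags)
{
    return model_alloc_file(flags | MODEL_FMODE_NONOTIFY);
}
```

    In this model, a file from model_alloc_file() reaches the watcher
    lookup on every modify, while a file from model_alloc_file_pseudo()
    never does, which is the saving the patch is after.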
    
    The test motivating this was "perf bench sched messaging --pipe". On
    a single-socket machine using threads, the patch made the following
    difference.
    
                                  5.7.0                  5.7.0
                                vanilla        nofsnotify-v1r1
    Amean     1       1.3837 (   0.00%)      1.3547 (   2.10%)
    Amean     3       3.7360 (   0.00%)      3.6543 (   2.19%)
    Amean     5       5.8130 (   0.00%)      5.7233 *   1.54%*
    Amean     7       8.1490 (   0.00%)      7.9730 *   2.16%*
    Amean     12     14.6843 (   0.00%)     14.1820 (   3.42%)
    Amean     18     21.8840 (   0.00%)     21.7460 (   0.63%)
    Amean     24     28.8697 (   0.00%)     29.1680 (  -1.03%)
    Amean     30     36.0787 (   0.00%)     35.2640 *   2.26%*
    Amean     32     38.0527 (   0.00%)     38.1223 (  -0.18%)
    
    The difference is small but in some cases it is outside the noise, so
    while marginal, there is still some benefit to ignoring fsnotify for
    files allocated via alloc_file_pseudo().
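
    The percentages in the table appear to be the relative reduction in
    mean time against the vanilla kernel. A minimal sketch of that
    arithmetic, assuming this interpretation (percent_gain is a
    hypothetical helper, not part of any benchmark tool):

```c
#include <assert.h>

/*
 * Relative improvement of `patched` over `baseline`, in percent.
 * Positive means the patched kernel was faster (lower mean time).
 */
static double percent_gain(double baseline, double patched)
{
    assert(baseline > 0.0);
    return (baseline - patched) / baseline * 100.0;
}
```

    For the first row, percent_gain(1.3837, 1.3547) is roughly 2.10, and
    for the row with 24 groups, percent_gain(28.8697, 29.1680) is roughly
    -1.03, matching the reported figures.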
    
    Link: https://lore.kernel.org/r/20200615121358.GF3183@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Reviewed-by: Amir Goldstein <amir73il@gmail.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
file_table.c 10.3 KB