    epoll: optimize EPOLL_CTL_DEL using rcu · 81ff0d3b
    Jason Baron authored
    commit ae10b2b4 upstream.
    
    Nathan Zimmer found that once we get over 10 cpus, the scalability of
    SPECjbb falls over due to the contention on the global 'epmutex', which is
    taken on EPOLL_CTL_ADD and EPOLL_CTL_DEL operations.
    
    Patch #1 removes the 'epmutex' lock completely from the EPOLL_CTL_DEL path
    by using rcu to guard against any concurrent traversals.
    
    Patch #2 removes the 'epmutex' lock from EPOLL_CTL_ADD operations for
    simple topologies, i.e. when adding a link from an epoll file descriptor to
    a wakeup source, where the epoll file descriptor is not nested.
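
To make the two kinds of topology concrete, here is a minimal userspace sketch (not part of the patch) that issues the EPOLL_CTL_ADD and EPOLL_CTL_DEL operations discussed above. The helper names `ep_add`, `ep_del`, `close_if_open` and `topology_demo` are hypothetical, and the choice of an eventfd as the wakeup source is an assumption made purely for illustration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: register fd on epfd for readability events. */
static int ep_add(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };

    assert(epfd >= 0 && fd >= 0);
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: drop fd from epfd (the EPOLL_CTL_DEL path). */
static int ep_del(int epfd, int fd)
{
    assert(epfd >= 0 && fd >= 0);
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}

static void close_if_open(int fd)
{
    if (fd >= 0)
        close(fd);
}

/*
 * Build a simple link and then a nested one, and tear both down.
 * Returns 0 if every step succeeded, -1 otherwise.
 */
static int topology_demo(void)
{
    int outer = epoll_create1(0);
    int inner = epoll_create1(0);
    int wake = eventfd(0, 0);   /* assumed wakeup source */
    int rc = -1;

    if (outer < 0 || inner < 0 || wake < 0)
        goto out;

    /* Simple topology: a non-nested epoll fd linked to a wakeup source. */
    if (ep_add(inner, wake) != 0)
        goto out;

    /* Nested topology: one epoll fd watching another epoll fd. */
    if (ep_add(outer, inner) != 0)
        goto out;

    /* Deletes: the operations patch #1 is concerned with. */
    if (ep_del(outer, inner) != 0)
        goto out;
    if (ep_del(inner, wake) != 0)
        goto out;

    rc = 0;
out:
    close_if_open(wake);
    close_if_open(inner);
    close_if_open(outer);
    return rc;
}
```

Calling `topology_demo()` from a `main()` on a Linux system exercises both add paths and both delete paths once; it says nothing about contention, which only shows up with many threads issuing these calls concurrently.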
    
    This patch (of 2):
    
    Optimize EPOLL_CTL_DEL such that it does not require the 'epmutex' by
    converting the file->f_ep_links list into an rcu one.  In this way, we can
    traverse the epoll network on the add path in parallel with deletes.
    Since deletes can't create loops or worse wakeup paths, this is safe.
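
The idea of letting a delete proceed while a traversal is in flight can be sketched in userspace as follows. This is a toy illustration only, not the kernel code: it uses C11 atomics and a reader counter as a stand-in grace period, whereas the kernel relies on its own RCU primitives (for example `list_del_rcu`). All names below (`struct link`, `toy_read_lock`, `toy_synchronize`, `link_add`, `link_del`, `link_count`) are hypothetical, and writers are assumed to be serialized by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry on a per-file list of epoll links. */
struct link {
    int id;
    _Atomic(struct link *) next;
};

static _Atomic(struct link *) head;  /* list head, published atomically */
static atomic_int readers;           /* readers currently traversing */

/* Toy read-side markers; NOT the kernel's rcu_read_lock()/unlock(). */
static void toy_read_lock(void)   { atomic_fetch_add(&readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&readers, 1); }

/* Toy grace period: spin until some instant with no reader inside. */
static void toy_synchronize(void)
{
    while (atomic_load(&readers) != 0)
        ;
}

/* Writer: publish a new link at the head (writers must be serialized). */
static struct link *link_add(int id)
{
    struct link *n = malloc(sizeof(*n));

    assert(n != NULL);
    n->id = id;
    atomic_init(&n->next, atomic_load(&head));
    atomic_store(&head, n);
    return n;
}

/* Writer: unlink victim, wait for readers to drain, then free it. */
static void link_del(struct link *victim)
{
    _Atomic(struct link *) *pp = &head;
    struct link *cur;

    while ((cur = atomic_load(pp)) != NULL && cur != victim)
        pp = &cur->next;
    assert(cur == victim);

    /*
     * Unlink. victim->next is left intact, so a reader that is
     * currently standing on victim can still walk to the rest of the list.
     */
    atomic_store(pp, atomic_load(&victim->next));

    toy_synchronize();  /* no reader can still hold a pointer to victim */
    free(victim);
}

/* Reader: traverse without taking any writer-side lock. */
static int link_count(void)
{
    int n = 0;

    toy_read_lock();
    for (struct link *cur = atomic_load(&head); cur != NULL;
         cur = atomic_load(&cur->next))
        n++;
    toy_read_unlock();
    return n;
}
```

A reader racing with `link_del` sees the list either with or without the victim, and both views are valid lists. That matches the argument above: removing a link can only shrink what a traversal finds, so it cannot introduce a loop or a deeper path.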
    
    This patch, in combination with the patch "epoll: Do not take global
    'epmutex' for simple topologies", shows a dramatic scalability
    improvement for SPECjbb.

    Signed-off-by: Jason Baron <jbaron@akamai.com>
    Tested-by: Nathan Zimmer <nzimmer@sgi.com>
    Cc: Eric Wong <normalperson@yhbt.net>
    Cc: Nelson Elhage <nelhage@nelhage.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Davide Libenzi <davidel@xmailserver.org>
    Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
    CC: Wu Fengguang <fengguang.wu@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
eventpoll.c 57.8 KB