Commit a80a6b85 authored by Andrew Morton's avatar Andrew Morton Committed by Linus Torvalds

revert "epoll: support for disabling items, and a self-test app"

Revert commit 03a7beb5 ("epoll: support for disabling items, and a
self-test app") pending resolution of the issues identified by Michael
Kerrisk, copied below.

We'll revisit this for 3.8.

: I've taken a look at this patch as it currently stands in 3.7-rc1, and
: done a bit of testing. (By the way, the test program
: tools/testing/selftests/epoll/test_epoll.c does not compile...)
:
: There are one or two places where the behavior seems a little strange,
: so I have a question or two at the end of this mail. But other than
: that, I want to check my understanding so that the interface can be
: correctly documented.
:
: Just to go though my understanding, the problem is the following
: scenario in a multithreaded application:
:
: 1. Multiple threads are performing epoll_wait() operations,
:    and maintaining a user-space cache that contains information
:    corresponding to each file descriptor being monitored by
:    epoll_wait().
:
: 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
:    a file descriptor from the epoll interest list, and
:    delete the corresponding record from the user-space cache.
:
: 3. The problem with (2) is that some other thread may have
:    previously done an epoll_wait() that retrieved information
:    about the fd in question, and may be in the middle of using
:    information in the cache that relates to that fd. Thus,
:    there is a potential race.
:
: 4. The race can't solved purely in user space, because doing
:    so would require applying a mutex across the epoll_wait()
:    call, which would of course blow thread concurrency.
:
: Right?
:
: Your solution is the EPOLL_CTL_DISABLE operation. I want to
: confirm my understanding about how to use this flag, since
: the description that has accompanied the patches so far
: has been a bit sparse
:
: 0. In the scenario you're concerned about, deleting a file
:    descriptor means (safely) doing the following:
:    (a) Deleting the file descriptor from the epoll interest list
:        using EPOLL_CTL_DEL
:    (b) Deleting the corresponding record in the user-space cache
:
: 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
:    conjunction with EPOLLONESHOT.
:
: 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
:    conjunction is a logical error.
:
: 3. The correct way to code multithreaded applications using
:    EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
:
:    a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
:       should EPOLLONESHOT.
:
:    b. When a thread wants to delete a file descriptor, it
:       should do the following:
:
:       [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
:       [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
:           was zero, then the file descriptor can be safely
:           deleted by the thread that made this call.
:       [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
:           then the descriptor is in use. In this case, the calling
:           thread should set a flag in the user-space cache to
:           indicate that the thread that is using the descriptor
:           should perform the deletion operation.
:
: Is all of the above correct?
:
: The implementation depends on checking on whether
: (events & ~EP_PRIVATE_BITS) == 0
: This replies on the fact that EPOLL_CTL_AD and EPOLL_CTL_MOD always
: set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
: causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
: cleared.
:
: A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
: is only useful in conjunction with EPOLLONESHOT. However, as things
: stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
: not have EPOLLONESHOT set in 'events' This results in the following
: (slightly surprising) behavior:
:
: (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
:     (the indicator that the file descriptor can be safely deleted).
: (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
:
: This doesn't seem particularly useful, and in fact is probably an
: indication that the user made a logic error: they should only be using
: epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
: EPOLLONESHOT was set in 'events'. If that is correct, then would it
: not make sense to return an error to user space for this case?

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent c24f9f19
...@@ -346,7 +346,7 @@ static inline struct epitem *ep_item_from_epqueue(poll_table *p) ...@@ -346,7 +346,7 @@ static inline struct epitem *ep_item_from_epqueue(poll_table *p)
/* Tells if the epoll_ctl(2) operation needs an event copy from userspace */ /* Tells if the epoll_ctl(2) operation needs an event copy from userspace */
static inline int ep_op_has_event(int op) static inline int ep_op_has_event(int op)
{ {
return op == EPOLL_CTL_ADD || op == EPOLL_CTL_MOD; return op != EPOLL_CTL_DEL;
} }
/* Initialize the poll safe wake up structure */ /* Initialize the poll safe wake up structure */
...@@ -676,34 +676,6 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi) ...@@ -676,34 +676,6 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
return 0; return 0;
} }
/*
* Disables a "struct epitem" in the eventpoll set. Returns -EBUSY if the item
* had no event flags set, indicating that another thread may be currently
* handling that item's events (in the case that EPOLLONESHOT was being
* used). Otherwise a zero result indicates that the item has been disabled
* from receiving events. A disabled item may be re-enabled via
* EPOLL_CTL_MOD. Must be called with "mtx" held.
*/
static int ep_disable(struct eventpoll *ep, struct epitem *epi)
{
int result = 0;
unsigned long flags;
spin_lock_irqsave(&ep->lock, flags);
if (epi->event.events & ~EP_PRIVATE_BITS) {
if (ep_is_linked(&epi->rdllink))
list_del_init(&epi->rdllink);
/* Ensure ep_poll_callback will not add epi back onto ready
list: */
epi->event.events &= EP_PRIVATE_BITS;
}
else
result = -EBUSY;
spin_unlock_irqrestore(&ep->lock, flags);
return result;
}
static void ep_free(struct eventpoll *ep) static void ep_free(struct eventpoll *ep)
{ {
struct rb_node *rbp; struct rb_node *rbp;
...@@ -1048,6 +1020,8 @@ static void ep_rbtree_insert(struct eventpoll *ep, struct epitem *epi) ...@@ -1048,6 +1020,8 @@ static void ep_rbtree_insert(struct eventpoll *ep, struct epitem *epi)
rb_insert_color(&epi->rbn, &ep->rbr); rb_insert_color(&epi->rbn, &ep->rbr);
} }
#define PATH_ARR_SIZE 5 #define PATH_ARR_SIZE 5
/* /*
* These are the number paths of length 1 to 5, that we are allowing to emanate * These are the number paths of length 1 to 5, that we are allowing to emanate
...@@ -1813,12 +1787,6 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd, ...@@ -1813,12 +1787,6 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
} else } else
error = -ENOENT; error = -ENOENT;
break; break;
case EPOLL_CTL_DISABLE:
if (epi)
error = ep_disable(ep, epi);
else
error = -ENOENT;
break;
} }
mutex_unlock(&ep->mtx); mutex_unlock(&ep->mtx);
......
...@@ -25,7 +25,6 @@ ...@@ -25,7 +25,6 @@
#define EPOLL_CTL_ADD 1 #define EPOLL_CTL_ADD 1
#define EPOLL_CTL_DEL 2 #define EPOLL_CTL_DEL 2
#define EPOLL_CTL_MOD 3 #define EPOLL_CTL_MOD 3
#define EPOLL_CTL_DISABLE 4
/* /*
* Request the handling of system wakeup events so as to prevent system suspends * Request the handling of system wakeup events so as to prevent system suspends
......
TARGETS = breakpoints kcmp mqueue vm cpu-hotplug memory-hotplug epoll TARGETS = breakpoints kcmp mqueue vm cpu-hotplug memory-hotplug
all: all:
for TARGET in $(TARGETS); do \ for TARGET in $(TARGETS); do \
......
# Makefile for epoll selftests
all: test_epoll
%: %.c
gcc -pthread -g -o $@ $^
run_tests: all
./test_epoll
clean:
$(RM) test_epoll
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment