-
Hidetoshi Seto authored
NPTL has 3 control counters (total/wake/woken). so NPTL can know: "how many threads enter to wait"(total), "how many threads receive wake signal"(wake), and "how many threads exit waiting"(woken). Abstraction of pthread_cond_wait and pthread_cond_signal are: A01 pthread_cond_wait { A02 timeout = 0; A03 lock(counters); A04 total++; A05 val = get_from(futex); A06 unlock(counters); A07 A08 sys_futex(futex, FUTEX_WAIT, val, timeout); A09 A10 lock(counters); A11 woken++; A12 unlock(counters); A13 } B01 pthread_cond_signal { B02 lock(counters); B03 if(total>wake) { /* if there is waiter */ B04 wake++; B05 update_val(futex); B06 sys_futex(futex, FUTEX_WAKE, 1); B07 } B08 unlock(counters); B09 } What we have to notice is: FUTEX_WAKE could be called before FUTEX_WAIT have called (at A07). In such case, FUTEX_WAKE will fail if there is no thread in waitqueue. However, since pthread_cond_signal do not only wake++ but also update_val(futex), next FUTEX_WAIT will fail with -EWOULDBLOCK because the val passed to WAIT is now not equal to updated val. Therefore, as the result, it seems that the WAKE wakes the WAIT. === The bug will appear if 2 pair of wait & wake called at (nearly)once: * Assume 4 threads, wait_A, wait_B, wake_X, and wake_Y * counters start from [total/wake/woken]=[0/0/0] * the val of futex starts from (0), update means inclement of the val. * there is no thread in waitqueue on the futex. [simulation] wait_A: calls pthread_cond_wait: total++, prepare to call FUTEX_WAIT with val=0. # status: [1/0/0] (0) queue={}(empty) # wake_X: calls pthread_cond_signal: no one in waitqueue, just wake++ and update futex val. # status: [1/1/0] (1) queue={}(empty) # wait_B: calls pthread_cond_wait: total++, prepare to call FUTEX_WAIT with val=1. # status: [2/1/0] (1) queue={}(empty) # wait_A: calls FUTEX_WAIT with val=0: after queueing, compare val. 0!=1 ... this should be blocked... # status: [2/1/0] (1) queue={A} # wait_B: calls FUTEX_WAIT with val=1: after queueing, compare val. 1==1 ... OK, let's schedule()... # status: [2/1/0] (1) queue={A,B} (B=sleeping) # wake_Y: calls pthread_cond_signal: A is in waitqueue ... dequeue A, wake++ and update futex val. # status: [2/2/0] (2) queue={B} (B=sleeping) # wait_A: end of FUTEX_WAIT with val=0: try to dequeue but already dequeued, return anyway. # status: [2/2/0] (2) queue={B} (B=sleeping) # wait_A: end of pthread_cond_wait: woken++. # status: [2/2/1] (2) queue={B} (B=sleeping) # This is bug: wait_A: wakeup wait_B: sleeping wake_X: wake A wake_Y: wake A again if subsequent wake_Z try to wake B: wake_Z: calls pthread_cond_signal: since total==wake, do nothing. # status: [2/2/1] (2) queue={B} (B=sleeping) # If wait_C comes, B become to can be woken, but C... This bug makes the waitqueue to trap some threads in it all time. ==== > - According to man of futex: > "If the futex was not equal to the expected value, the operation > returns -EWOULDBLOCK." > but now, here is no description about the rare case: > "returns 0 if the futex was not equal to the expected value, but > the process was woken by a FUTEX_WAKE call." > this behavior on rare case causes the hang which I found. So to avoid this problem, my patch shut up the window that you said: > The patch certainly looks sensible - I can see that without the patch, > there is a window in which this process is pointlessly queued up on the > futex and that in this window a wakeup attempt might do a bad thing. ===== In short: There is an un-documented behavior of futex_wait. This behavior misleads NPTL to wake a thread doubly, as the result, causes an application hang. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
4ee9f919