• Hidetoshi Seto's avatar
    [PATCH] futex_wait hang fix · 4ee9f919
    Hidetoshi Seto authored
    NPTL has 3 control counters (total/wake/woken).
    so NPTL can know:
      "how many threads enter to wait"(total),
      "how many threads receive wake signal"(wake),
      and "how many threads exit waiting"(woken).
    
    Abstraction of pthread_cond_wait and pthread_cond_signal are:
    
    A01 pthread_cond_wait {
    A02   timeout = 0;
    A03   lock(counters);
    A04     total++;
    A05     val = get_from(futex);
    A06   unlock(counters);
    A07
    A08   sys_futex(futex, FUTEX_WAIT, val, timeout);
    A09
    A10   lock(counters);
    A11     woken++;
    A12   unlock(counters);
    A13 }
    
    B01 pthread_cond_signal {
    B02   lock(counters);
    B03   if(total>wake) { /* if there is waiter */
    B04     wake++;
    B05     update_val(futex);
    B06     sys_futex(futex, FUTEX_WAKE, 1);
    B07   }
    B08   unlock(counters);
    B09 }
    
    What we have to notice is:
        FUTEX_WAKE could be called before FUTEX_WAIT have called (at A07).
    In such case, FUTEX_WAKE will fail if there is no thread in waitqueue.
    
    However, since pthread_cond_signal do not only wake++ but also
    update_val(futex), next FUTEX_WAIT will fail with -EWOULDBLOCK because the val
    passed to WAIT is now not equal to updated val.  Therefore, as the result, it
    seems that the WAKE wakes the WAIT.
    
    ===
    
    The bug will appear if 2 pair of wait & wake called at (nearly)once:
    
       * Assume 4 threads, wait_A, wait_B, wake_X, and wake_Y
       * counters start from [total/wake/woken]=[0/0/0]
       * the val of futex starts from (0), update means inclement of the val.
       * there is no thread in waitqueue on the futex.
    
    [simulation]
    
    wait_A: calls pthread_cond_wait:
        total++, prepare to call FUTEX_WAIT with val=0.
        # status: [1/0/0] (0) queue={}(empty) #
    
    wake_X: calls pthread_cond_signal:
        no one in waitqueue, just wake++ and update futex val.
        # status: [1/1/0] (1) queue={}(empty) #
    
    wait_B: calls pthread_cond_wait:
        total++, prepare to call FUTEX_WAIT with val=1.
        # status: [2/1/0] (1) queue={}(empty) #
    
    wait_A: calls FUTEX_WAIT with val=0:
        after queueing, compare val. 0!=1 ... this should be blocked...
        # status: [2/1/0] (1) queue={A} #
    
    wait_B: calls FUTEX_WAIT with val=1:
        after queueing, compare val. 1==1 ... OK, let's schedule()...
        # status: [2/1/0] (1) queue={A,B} (B=sleeping) #
    
    wake_Y: calls pthread_cond_signal:
        A is in waitqueue ... dequeue A, wake++ and update futex val.
        # status: [2/2/0] (2) queue={B} (B=sleeping) #
    
    wait_A: end of FUTEX_WAIT with val=0:
        try to dequeue but already dequeued, return anyway.
        # status: [2/2/0] (2) queue={B} (B=sleeping) #
    
    wait_A: end of pthread_cond_wait:
        woken++.
        # status: [2/2/1] (2) queue={B} (B=sleeping) #
    
    This is bug:
       wait_A: wakeup
       wait_B: sleeping
       wake_X: wake A
       wake_Y: wake A again
    
    if subsequent wake_Z try to wake B:
    
    wake_Z: calls pthread_cond_signal:
        since total==wake, do nothing.
        # status: [2/2/1] (2) queue={B} (B=sleeping) #
    
    If wait_C comes, B become to can be woken, but C...
    
    This bug makes the waitqueue to trap some threads in it all time.
    
    ====
    
    >  - According to man of futex:
    >      "If the futex was not equal to the expected value, the operation
    >       returns -EWOULDBLOCK."
    >    but now, here is no description about the rare case:
    >      "returns 0 if the futex was not equal to the expected value, but
    >       the process was woken by a FUTEX_WAKE call."
    >    this behavior on rare case causes the hang which I found.
    
    So to avoid this problem, my patch shut up the window that you said:
    
     > The patch certainly looks sensible - I can see that without the patch,
     > there is a window in which this process is pointlessly queued up on the
     > futex and that in this window a wakeup attempt might do a bad thing.
    
    =====
    
    In short:
    There is an un-documented behavior of futex_wait. This behavior misleads
    NPTL to wake a thread doubly, as the result, causes an application hang.
    Signed-off-by: default avatarHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
    Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    4ee9f919
futex.c 17.9 KB