• Linus Torvalds's avatar
    pipe: make pipe writes always wake up readers · 3a34b13a
    Linus Torvalds authored
    Since commit 1b6b26ae ("pipe: fix and clarify pipe write wakeup
    logic") we have sanitized the pipe write logic, and would only try to
    wake up readers if they needed it.
    
    In particular, if the pipe already had data in it before the write,
    there was no point in trying to wake up a reader, since any existing
    readers must have been aware of the pre-existing data already.  Doing
    extraneous wakeups will only cause potential thundering herd problems.
    
    However, it turns out that some Android libraries have misused the EPOLL
    interface, and expected "edge triggered" be to "any new write will
    trigger it".  Even if there was no edge in sight.
    
    Quoting Sandeep Patil:
     "The commit 1b6b26ae ('pipe: fix and clarify pipe write wakeup
      logic') changed pipe write logic to wakeup readers only if the pipe
      was empty at the time of write. However, there are libraries that
      relied upon the older behavior for notification scheme similar to
      what's described in [1]
    
      One such library 'realm-core'[2] is used by numerous Android
      applications. The library uses a similar notification mechanism as GNU
      Make but it never drains the pipe until it is full. When Android moved
      to v5.10 kernel, all applications using this library stopped working.
    
      The library has since been fixed[3] but it will be a while before all
      applications incorporate the updated library"
    
    Our regression rule for the kernel is that if applications break from
    new behavior, it's a regression, even if it was because the application
    did something patently wrong.  Also note the original report [4] by
    Michal Kerrisk about a test for this epoll behavior - but at that point
    we didn't know of any actual broken use case.
    
    So add the extraneous wakeup, to approximate the old behavior.
    
    [ I say "approximate", because the exact old behavior was to do a wakeup
      not for each write(), but for each pipe buffer chunk that was filled
      in. The behavior introduced by this change is not that - this is just
      "every write will cause a wakeup, whether necessary or not", which
      seems to be sufficient for the broken library use. ]
    
    It's worth noting that this adds the extraneous wakeup only for the
    write side, while the read side still considers the "edge" to be purely
    about reading enough from the pipe to allow further writes.
    
    See commit f467a6a6 ("pipe: fix and clarify pipe read wakeup logic")
    for the pipe read case, which remains that "only wake up if the pipe was
    full, and we read something from it".
    
    Link: https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/ [1]
    Link: https://github.com/realm/realm-core [2]
    Link: https://github.com/realm/realm-core/issues/4666 [3]
    Link: https://lore.kernel.org/lkml/CAKgNAkjMBGeAwF=2MKK758BhxvW58wYTgYKB2V-gY1PwXxrH+Q@mail.gmail.com/ [4]
    Link: https://lore.kernel.org/lkml/20210729222635.2937453-1-sspatil@android.com/Reported-by: default avatarSandeep Patil <sspatil@android.com>
    Cc: Michael Kerrisk <mtk.manpages@gmail.com>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    3a34b13a
pipe.c 33.7 KB