• Christian Brauner's avatar
    open: return EINVAL for O_DIRECTORY | O_CREAT · 43b45063
    Christian Brauner authored
    After a couple of years and multiple LTS releases we received a report
    that the behavior of O_DIRECTORY | O_CREAT changed starting with v5.7.
    
    On kernels prior to v5.7 combinations of O_DIRECTORY, O_CREAT, O_EXCL
    had the following semantics:
    
    (1) open("/tmp/d", O_DIRECTORY | O_CREAT)
        * d doesn't exist:                create regular file
        * d exists and is a regular file: ENOTDIR
        * d exists and is a directory:    EISDIR
    
    (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL)
        * d doesn't exist:                create regular file
        * d exists and is a regular file: EEXIST
        * d exists and is a directory:    EEXIST
    
    (3) open("/tmp/d", O_DIRECTORY | O_EXCL)
        * d doesn't exist:                ENOENT
        * d exists and is a regular file: ENOTDIR
        * d exists and is a directory:    open directory
    
    On kernels since to v5.7 combinations of O_DIRECTORY, O_CREAT, O_EXCL
    have the following semantics:
    
    (1) open("/tmp/d", O_DIRECTORY | O_CREAT)
        * d doesn't exist:                ENOTDIR (create regular file)
        * d exists and is a regular file: ENOTDIR
        * d exists and is a directory:    EISDIR
    
    (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL)
        * d doesn't exist:                ENOTDIR (create regular file)
        * d exists and is a regular file: EEXIST
        * d exists and is a directory:    EEXIST
    
    (3) open("/tmp/d", O_DIRECTORY | O_EXCL)
        * d doesn't exist:                ENOENT
        * d exists and is a regular file: ENOTDIR
        * d exists and is a directory:    open directory
    
    This is a fairly substantial semantic change that userspace didn't
    notice until Pedro took the time to deliberately figure out corner
    cases. Since no one noticed this breakage we can somewhat safely assume
    that O_DIRECTORY | O_CREAT combinations are likely unused.
    
    The v5.7 breakage is especially weird because while ENOTDIR is returned
    indicating failure a regular file is actually created. This doesn't make
    a lot of sense.
    
    Time was spent finding potential users of this combination. Searching on
    codesearch.debian.net showed that codebases often express semantical
    expectations about O_DIRECTORY | O_CREAT which are completely contrary
    to what our code has done and currently does.
    
    The expectation often is that this particular combination would create
    and open a directory. This suggests users who tried to use that
    combination would stumble upon the counterintuitive behavior no matter
    if pre-v5.7 or post v5.7 and quickly realize neither semantics give them
    what they want. For some examples see the code examples in [1] to [3]
    and the discussion in [4].
    
    There are various ways to address this issue. The lazy/simple option
    would be to restore the pre-v5.7 behavior and to just live with that bug
    forever. But since there's a real chance that the O_DIRECTORY | O_CREAT
    quirk isn't relied upon we should try to get away with murder(ing bad
    semantics) first. If we need to Frankenstein pre-v5.7 behavior later so
    be it.
    
    So let's simply return EINVAL categorically for O_DIRECTORY | O_CREAT
    combinations. In addition to cleaning up the old bug this also opens up
    the possiblity to make that flag combination do something more intuitive
    in the future.
    
    Starting with this commit the following semantics apply:
    
    (1) open("/tmp/d", O_DIRECTORY | O_CREAT)
        * d doesn't exist:                EINVAL
        * d exists and is a regular file: EINVAL
        * d exists and is a directory:    EINVAL
    
    (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL)
        * d doesn't exist:                EINVAL
        * d exists and is a regular file: EINVAL
        * d exists and is a directory:    EINVAL
    
    (3) open("/tmp/d", O_DIRECTORY | O_EXCL)
        * d doesn't exist:                ENOENT
        * d exists and is a regular file: ENOTDIR
        * d exists and is a directory:    open directory
    
    One additional note, O_TMPFILE is implemented as:
    
        #define __O_TMPFILE    020000000
        #define O_TMPFILE      (__O_TMPFILE | O_DIRECTORY)
        #define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT)
    
    For older kernels it was important to return an explicit error when
    O_TMPFILE wasn't supported. So O_TMPFILE requires that O_DIRECTORY is
    raised alongside __O_TMPFILE. It also enforced that O_CREAT wasn't
    specified. Since O_DIRECTORY | O_CREAT could be used to create a regular
    allowing that combination together with __O_TMPFILE would've meant that
    false positives were possible, i.e., that a regular file was created
    instead of a O_TMPFILE. This could've been used to trick userspace into
    thinking it operated on a O_TMPFILE when it wasn't.
    
    Now that we block O_DIRECTORY | O_CREAT completely the check for O_CREAT
    in the __O_TMPFILE branch via if ((flags & O_TMPFILE_MASK) != O_TMPFILE)
    can be dropped. Instead we can simply check verify that O_DIRECTORY is
    raised via if (!(flags & O_DIRECTORY)) and explain this in two comments.
    
    As Aleksa pointed out O_PATH is unaffected by this change since it
    always returned EINVAL if O_CREAT was specified - with or without
    O_DIRECTORY.
    
    Link: https://lore.kernel.org/lkml/20230320071442.172228-1-pedro.falcato@gmail.com
    Link: https://sources.debian.org/src/flatpak/1.14.4-1/subprojects/libglnx/glnx-dirfd.c/?hl=324#L324 [1]
    Link: https://sources.debian.org/src/flatpak-builder/1.2.3-1/subprojects/libglnx/glnx-shutil.c/?hl=251#L251 [2]
    Link: https://sources.debian.org/src/ostree/2022.7-2/libglnx/glnx-dirfd.c/?hl=324#L324 [3]
    Link: https://www.openwall.com/lists/oss-security/2014/11/26/14 [4]
    Reported-by: default avatarPedro Falcato <pedro.falcato@gmail.com>
    Cc: Aleksa Sarai <cyphar@cyphar.com>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
    43b45063
fcntl.h 5.46 KB