  1. 22 Aug, 2022 1 commit
    • ntfs: fix acl handling · 0c3bc789
      Christian Brauner authored
      While looking at our current POSIX ACL handling in the context of some
      overlayfs work I went through a range of other filesystems checking how they
      handle them currently and encountered ntfs3.
      
      The posix_acl_{from,to}_xattr() helpers always need to operate on the
      filesystem idmapping. Since ntfs3 can only be mounted in the initial user
      namespace the relevant idmapping is init_user_ns.
      
      The posix_acl_{from,to}_xattr() helpers are concerned with translating between
      the kernel internal struct posix_acl{_entry} and the uapi struct
      posix_acl_xattr_{header,entry} and the kernel internal data structure is cached
      filesystem wide.
      
      Additional idmappings such as the caller's idmapping or the mount's idmapping
      are handled higher up in the VFS. Individual filesystems usually do not need to
      concern themselves with these.
      
      The posix_acl_valid() helper is concerned with checking whether the values in
      the kernel internal struct posix_acl can be represented in the filesystem's
      idmapping. IOW, if they can be written to disk. So this helper too needs to
      take the filesystem's idmapping.
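
      The role of the filesystem idmapping in this check can be sketched in
      userspace C. This is a simplified model, not kernel code: struct
      model_idmap and the model_* functions are hypothetical stand-ins for the
      kernel's idmapping data and for the per-entry check that
      posix_acl_valid() performs, and a single extent stands in for a full
      idmapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID UINT32_MAX /* plays the role of (uid_t)-1 */

/* Hypothetical single-extent idmapping, written u<first>:k<lower_first>:r<count>. */
struct model_idmap {
    uint32_t first;
    uint32_t lower_first;
    uint32_t count;
};

/* Identity mapping standing in for init_user_ns. */
static const struct model_idmap model_init_idmap = { 0, 0, UINT32_MAX };

/* Map a kernel-side id back to the id the filesystem would store on disk. */
static uint32_t model_from_kid(const struct model_idmap *fs, uint32_t kid)
{
    assert(fs != NULL);
    if (kid >= fs->lower_first && kid - fs->lower_first < fs->count)
        return fs->first + (kid - fs->lower_first);
    return MODEL_INVALID_ID;
}

/* An ACL_USER/ACL_GROUP id is valid only if the filesystem idmapping can represent it. */
static bool model_acl_id_valid(const struct model_idmap *fs, uint32_t kid)
{
    return model_from_kid(fs, kid) != MODEL_INVALID_ID;
}
```

      In this model the identity mapping accepts every id except the invalid
      marker, while a non-initial filesystem idmapping rejects ids outside its
      extent.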
      
      Fixes: be71b5cb ("fs/ntfs3: Add attrib operations")
      Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
      Cc: ntfs3@lists.linux.dev
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
  2. 17 Aug, 2022 3 commits
    • fs: require CAP_SYS_ADMIN in target namespace for idmapped mounts · bf1ac16e
      Seth Forshee authored
      Idmapped mounts should not allow a user to map file ownership into a
      range of ids which is not under the control of that user. However, we
      currently don't check whether the mounter is privileged with respect to
      the target user namespace.
      
      Currently no FS_USERNS_MOUNT filesystems support idmapped mounts, thus
      this is not a problem as only CAP_SYS_ADMIN in init_user_ns is allowed
      to set up idmapped mounts. But this could change in the future, so add a
      check to refuse to create idmapped mounts when the mounter does not have
      CAP_SYS_ADMIN in the target user namespace.
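
      The added check can be illustrated with a small userspace model of how a
      capability is resolved against a target user namespace. Everything here
      is a hypothetical simplification (struct model_userns, struct model_cred
      and the model_* functions are not kernel APIs); it only covers the new
      "privileged in the target namespace" test, not the other conditions for
      creating an idmapped mount.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a user namespace in the namespace tree. */
struct model_userns {
    const struct model_userns *parent; /* NULL for the initial namespace */
    int level;                         /* 0 for the initial namespace */
    unsigned owner_uid;                /* uid that created this namespace */
};

/* Hypothetical model of the mounter's credentials. */
struct model_cred {
    const struct model_userns *user_ns; /* namespace the mounter lives in */
    unsigned euid;
    bool cap_sys_admin;                 /* CAP_SYS_ADMIN raised in user_ns */
};

/* Walk from the target namespace towards the root looking for authority. */
static bool model_ns_capable_sys_admin(const struct model_cred *cred,
                                       const struct model_userns *target)
{
    const struct model_userns *ns = target;

    assert(cred != NULL && target != NULL);
    for (;;) {
        /* Same namespace: the capability set decides. */
        if (ns == cred->user_ns)
            return cred->cap_sys_admin;
        /* Target is not a descendant of the mounter's namespace. */
        if (ns->level <= cred->user_ns->level)
            return false;
        /* The creator of a child namespace is privileged over it. */
        if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
            return true;
        ns = ns->parent;
    }
}

/* Refuse to create an idmapped mount for an unprivileged mounter. */
static int model_may_idmap_mount(const struct model_cred *mounter,
                                 const struct model_userns *target)
{
    if (!model_ns_capable_sys_admin(mounter, target))
        return -EPERM;
    return 0;
}
```

      Under these assumptions a mounter can only target a namespace it is
      privileged over; a namespace owned by another user, or an ancestor
      namespace, is refused.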
      
      Fixes: bd303368 ("fs: support mapped mounts of mapped filesystems")
      Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
      Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Link: https://lore.kernel.org/r/20220816164752.2595240-1-sforshee@digitalocean.com
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • MAINTAINERS: update idmapping tree · ddc84c90
      Christian Brauner authored
      Since Seth joined as a maintainer in ba40a57f ("Add Seth Forshee as
      co-maintainer for idmapped mounts") it was best to get a shared git tree
      instead of using our personal repositories. So we asked, and Konstantin
      suggested and gave us a new "idmapping" repository under the
      pre-existing but mainly unused vfs namespace. This just makes it easier
      for Seth to send fixes in case I'm out or someone else ever takes over.
      
      Cc: Seth Forshee <sforshee@digitalocean.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Link: https://lore.kernel.org/r/20220816113514.43304-2-brauner@kernel.org
    • acl: handle idmapped mounts for idmapped filesystems · abfcf55d
      Christian Brauner authored
      Ensure that POSIX ACLs checking, getting, and setting works correctly
      for filesystems mountable with a filesystem idmapping ("fs_idmapping")
      that want to support idmapped mounts ("mnt_idmapping").
      
      Note that no filesystem mountable with an fs_idmapping supports idmapped
      mounts yet. This is required infrastructure work to unblock this.
      
      As we explained in detail in [1] the fs_idmapping is irrelevant for
      getxattr() and setxattr() when mapping the ACL_{GROUP,USER} {g,u}ids
      stored in the uapi struct posix_acl_xattr_entry in
      posix_acl_fix_xattr_{from,to}_user().
      
      But for acl_permission_check() and posix_acl_{g,s}etxattr_idmapped_mnt()
      the fs_idmapping matters.
      
      acl_permission_check():
        During lookup POSIX ACLs are retrieved directly via i_op->get_acl() and
        are returned via the kernel internal struct posix_acl which contains
        e_{g,u}id members of type k{g,u}id_t that already take the
        fs_idmapping into account.
      
        For example, a POSIX ACL stored with u4 on the backing store is mapped
        to k10000004 in the fs_idmapping. The mnt_idmapping remaps the POSIX ACL
        to k20000004. In order to do that the fs_idmapping needs to be taken
        into account but that doesn't happen yet (again, this is a
        counterfactual, as fuse doesn't support idmapped mounts currently;
        it's just used as a convenient example):
      
        fs_idmapping:  u0:k10000000:r65536
        mnt_idmapping: u0:v20000000:r65536
        ACL_USER:      k10000004
      
        acl_permission_check()
        -> check_acl()
           -> get_acl()
              -> i_op->get_acl() == fuse_get_acl()
                 -> posix_acl_from_xattr(u0:k10000000:r65536 /* fs_idmapping */, ...)
                    {
                            k10000004 = make_kuid(u0:k10000000:r65536 /* fs_idmapping */,
                                                  u4 /* ACL_USER */);
                    }
           -> posix_acl_permission()
              {
                      -1 = make_vfsuid(u0:v20000000:r65536 /* mnt_idmapping */,
                                       &init_user_ns,
                                       k10000004);
                      vfsuid_eq_kuid(-1, k10000004 /* caller_fsuid */)
              }
      
        In order to correctly map from the fs_idmapping into mnt_idmapping we
        require the relevant fs_idmapping to be passed:
      
        acl_permission_check()
        -> check_acl()
           -> get_acl()
              -> i_op->get_acl() == fuse_get_acl()
                 -> posix_acl_from_xattr(u0:k10000000:r65536 /* fs_idmapping */, ...)
                    {
                            k10000004 = make_kuid(u0:k10000000:r65536 /* fs_idmapping */,
                                                  u4 /* ACL_USER */);
                    }
           -> posix_acl_permission()
              {
                      v20000004 = make_vfsuid(u0:v20000000:r65536 /* mnt_idmapping */,
                                              u0:k10000000:r65536 /* fs_idmapping */,
                                              k10000004);
                      vfsuid_eq_kuid(v20000004, k10000004 /* caller_fsuid */)
              }
      
        The initial_idmapping is only correct for the current situation because
        all filesystems that currently support idmapped mounts do not support
        being mounted with an fs_idmapping.
      
        Note that ovl_get_acl() is used to retrieve the POSIX ACLs from the
        relevant lower layer and the lower layer's mnt_idmapping needs to be
        taken into account and so does the fs_idmapping. See 0c5fd887 ("acl:
        move idmapped mount fixup into vfs_{g,s}etxattr()") for more details.
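
        The two call traces above can be replayed with a userspace sketch. It is
        a simplified model under stated assumptions: struct model_idmap and the
        model_* functions are hypothetical, each idmapping has a single extent,
        and the shortcut the kernel takes when no idmapping is involved is
        left out. model_make_vfsuid() first undoes the fs_idmapping and then
        applies the mnt_idmapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID UINT32_MAX /* plays the role of -1 in the traces */

/* Hypothetical single-extent idmapping, written u<first>:k<lower_first>:r<count>. */
struct model_idmap {
    uint32_t first;
    uint32_t lower_first;
    uint32_t count;
};

/* u -> k direction, in the spirit of make_kuid(). */
static uint32_t model_map_up(const struct model_idmap *m, uint32_t id)
{
    assert(m != NULL);
    if (id >= m->first && id - m->first < m->count)
        return m->lower_first + (id - m->first);
    return MODEL_INVALID_ID;
}

/* k -> u direction, in the spirit of from_kuid(). */
static uint32_t model_map_down(const struct model_idmap *m, uint32_t id)
{
    assert(m != NULL);
    if (id >= m->lower_first && id - m->lower_first < m->count)
        return m->first + (id - m->lower_first);
    return MODEL_INVALID_ID;
}

/* Translate a filesystem-mapped id into the mount's idmapping. */
static uint32_t model_make_vfsuid(const struct model_idmap *mnt,
                                  const struct model_idmap *fs,
                                  uint32_t kid)
{
    uint32_t uid = model_map_down(fs, kid); /* undo fs_idmapping */

    if (uid == MODEL_INVALID_ID)
        return MODEL_INVALID_ID;
    return model_map_up(mnt, uid);          /* apply mnt_idmapping */
}
```

        With fs_idmapping u0:k10000000:r65536 and mnt_idmapping
        u0:v20000000:r65536, passing an identity mapping in place of the
        fs_idmapping makes k10000004 unmappable, whereas passing the real
        fs_idmapping yields 20000004, matching the second trace.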
      
      For posix_acl_{g,s}etxattr_idmapped_mnt() it is not as obvious why the
      fs_idmapping matters as it is for acl_permission_check(), especially
      because it doesn't matter for posix_acl_fix_xattr_{from,to}_user() (see
      [1] for more context).

      This is because posix_acl_{g,s}etxattr_idmapped_mnt() operate on the uapi
      struct posix_acl_xattr_entry, which contains {g,u}id_t values and thus
      gives the impression that the fs_idmapping is irrelevant, as at this point
      appropriate {g,u}id_t values have seemingly been generated.

      As we've stated multiple times, this assumption is wrong: the uapi
      struct posix_acl_xattr_entry takes idmappings into account depending on
      where it is operated on.
      
      posix_acl_getxattr_idmapped_mnt():
        When posix_acl_getxattr_idmapped_mnt() is called the values stored in
        the uapi struct posix_acl_xattr_entry are mapped according to the
        fs_idmapping. This happened when they were read from the backing store
        and then translated from struct posix_acl into the uapi
        struct posix_acl_xattr_entry during posix_acl_to_xattr().
      
        In other words, the fs_idmapping matters as the values stored as
        {g,u}id_t in the uapi struct posix_acl_xattr_entry have been generated
        by it.
      
        So we need to take the fs_idmapping into account during make_vfsuid()
        in posix_acl_getxattr_idmapped_mnt().
      
      posix_acl_setxattr_idmapped_mnt():
        When posix_acl_setxattr_idmapped_mnt() is called the values stored as
        {g,u}id_t in uapi struct posix_acl_xattr_entry are intended to be the
        values that ultimately get turned back into a k{g,u}id_t in
        posix_acl_from_xattr() (which turns the uapi
        struct posix_acl_xattr_entry into the kernel internal struct posix_acl).
      
        In other words, the fs_idmapping matters as the values stored as
        {g,u}id_t in the uapi struct posix_acl_xattr_entry are intended to be
        the values that will be undone in the fs_idmapping when writing to the
        backing store.
      
        So we need to take the fs_idmapping into account during from_vfsuid()
        in posix_acl_setxattr_idmapped_mnt().
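
        The get and set directions described above are inverses of each other,
        which a userspace sketch can make concrete. This is a model only:
        struct model_acl_entry, the tag enum and the model_* functions are
        hypothetical, each idmapping has a single extent, and endianness
        handling of the real xattr format is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID UINT32_MAX

enum model_acl_tag { MODEL_ACL_USER_OBJ, MODEL_ACL_USER, MODEL_ACL_GROUP, MODEL_ACL_OTHER };

/* Hypothetical single-extent idmapping, written u<first>:k<lower_first>:r<count>. */
struct model_idmap { uint32_t first; uint32_t lower_first; uint32_t count; };

/* Hypothetical stand-in for one entry of the xattr ACL representation. */
struct model_acl_entry { enum model_acl_tag tag; uint32_t id; };

/* Undo the mapping 'from' and then apply the mapping 'to'. */
static uint32_t model_remap(const struct model_idmap *from,
                            const struct model_idmap *to, uint32_t id)
{
    uint32_t off;

    if (id < from->lower_first || id - from->lower_first >= from->count)
        return MODEL_INVALID_ID;
    off = id - from->lower_first;
    if (off >= to->count)
        return MODEL_INVALID_ID;
    return to->lower_first + off;
}

/* Only ACL_USER and ACL_GROUP entries carry an id that needs translating. */
static void model_fixup(const struct model_idmap *from, const struct model_idmap *to,
                        struct model_acl_entry *e, size_t n)
{
    assert(from != NULL && to != NULL && (e != NULL || n == 0));
    for (size_t i = 0; i < n; i++)
        if (e[i].tag == MODEL_ACL_USER || e[i].tag == MODEL_ACL_GROUP)
            e[i].id = model_remap(from, to, e[i].id);
}

/* getxattr: ids were generated by the fs_idmapping, present them via the mount. */
static void model_getxattr_fixup(const struct model_idmap *mnt, const struct model_idmap *fs,
                                 struct model_acl_entry *e, size_t n)
{
    model_fixup(fs, mnt, e, n);
}

/* setxattr: turn mount-mapped ids back into ids the fs_idmapping can undo. */
static void model_setxattr_fixup(const struct model_idmap *mnt, const struct model_idmap *fs,
                                 struct model_acl_entry *e, size_t n)
{
    model_fixup(mnt, fs, e, n);
}
```

        In this model both extents start at u0, so remapping reduces to moving
        an offset from one extent to the other; a get followed by a set returns
        the original id.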
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Fixes: 0c5fd887 ("acl: move idmapped mount fixup into vfs_{g,s}etxattr()")
      Cc: Seth Forshee <sforshee@digitalocean.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
      Link: https://lore.kernel.org/r/20220816113514.43304-1-brauner@kernel.org
  3. 14 Aug, 2022 10 commits
    • Linux 6.0-rc1 · 568035b0
      Linus Torvalds authored
    • radix-tree: replace gfp.h inclusion with gfp_types.h · 9f162193
      Yury Norov authored
      Radix tree header includes gfp.h for __GFP_BITS_SHIFT only. Now we
      have gfp_types.h for this.
      
      Fixes powerpc allmodconfig build:
      
         In file included from include/linux/nodemask.h:97,
                          from include/linux/mmzone.h:17,
                          from include/linux/gfp.h:7,
                          from include/linux/radix-tree.h:12,
                          from include/linux/idr.h:15,
                          from include/linux/kernfs.h:12,
                          from include/linux/sysfs.h:16,
                          from include/linux/kobject.h:20,
                          from include/linux/pci.h:35,
                          from arch/powerpc/kernel/prom_init.c:24:
         include/linux/random.h: In function 'add_latent_entropy':
      >> include/linux/random.h:25:46: error: 'latent_entropy' undeclared (first use in this function); did you mean 'add_latent_entropy'?
            25 |         add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy));
               |                                              ^~~~~~~~~~~~~~
               |                                              add_latent_entropy
         include/linux/random.h:25:46: note: each undeclared identifier is reported only once for each function it appears in

      Reported-by: kernel test robot <lkp@intel.com>
      CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 74cbb480
      Linus Torvalds authored
      Pull vfs lseek fix from Al Viro:
       "Fix proc_reg_llseek() breakage. Always had been possible if somebody
        left NULL ->proc_lseek, became a practical issue now"
      
      * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        take care to handle NULL ->proc_lseek()
    • take care to handle NULL ->proc_lseek() · 3f61631d
      Al Viro authored
      Easily done now, just by clearing FMODE_LSEEK in ->f_mode
      during proc_reg_open() for such entries.
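
      A userspace model of this approach, for illustration only: the struct
      names, the flag value and the model_* functions are hypothetical, and
      the seek path is reduced to the single check that matters here. Clearing
      the seek flag at open time means the NULL callback is never reached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_FMODE_LSEEK 0x4u /* hypothetical flag value for this sketch */

/* Hypothetical stand-in for the per-entry operations table. */
struct model_proc_ops {
    long (*proc_lseek)(long offset, int whence); /* may be NULL */
};

/* Hypothetical stand-in for an open file. */
struct model_file {
    unsigned f_mode;
    const struct model_proc_ops *ops;
};

/* Example seek callback: accepts any offset and reports it back. */
static long model_seek_identity(long offset, int whence)
{
    (void)whence;
    return offset;
}

/* At open time, mark entries without a seek callback as unseekable. */
static void model_proc_reg_open(struct model_file *file)
{
    assert(file != NULL && file->ops != NULL);
    if (file->ops->proc_lseek == NULL)
        file->f_mode &= ~MODEL_FMODE_LSEEK;
}

/* Seek path: unseekable files fail cleanly instead of calling NULL. */
static long model_llseek(struct model_file *file, long offset, int whence)
{
    assert(file != NULL && file->ops != NULL);
    if (!(file->f_mode & MODEL_FMODE_LSEEK))
        return -ESPIPE;
    return file->ops->proc_lseek(offset, whence);
}
```

      In the model, an entry with a callback seeks normally and an entry
      without one returns -ESPIPE.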
      
      Fixes: 868941b1 "fs: remove no_llseek"
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 5d6a0f4d
      Linus Torvalds authored
      Pull more xen updates from Juergen Gross:
      
       - fix the handling of the "persistent grants" feature negotiation
         between Xen blkfront and Xen blkback drivers
      
       - a cleanup of xen.config and adding xen.config to Xen section in
         MAINTAINERS
      
       - support HVMOP_set_evtchn_upcall_vector, which is more compliant to
         "normal" interrupt handling than the global callback used up to now
      
       - further small cleanups
      
      * tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        MAINTAINERS: add xen config fragments to XEN HYPERVISOR sections
        xen: remove XEN_SCRUB_PAGES in xen.config
        xen/pciback: Fix comment typo
        xen/xenbus: fix return type in xenbus_file_read()
        xen-blkfront: Apply 'feature_persistent' parameter when connect
        xen-blkback: Apply 'feature_persistent' parameter when connect
        xen-blkback: fix persistent grants negotiation
        x86/xen: Add support for HVMOP_set_evtchn_upcall_vector
    • Merge tag 'perf-tools-fixes-for-v6.0-2022-08-13' of... · 96f86ff0
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tool updates from Arnaldo Carvalho de Melo:
      
       - 'perf c2c' now supports ARM64, adjust its output to cope with
         differences with what is in x86_64. Now go find false sharing on
         ARM64 (at least Neoverse) as well!
      
       - Refactor the JSON processing, making the output more compact and thus
         reducing the size of the resulting perf binary
      
       - Improvements for 'perf offcpu' profiling, including tracking child
         processes
      
       - Update Intel JSON metrics and events files for broadwellde,
         broadwellx, cascadelakex, haswellx, icelakex, ivytown, jaketown,
         knightslanding, sapphirerapids, skylakex and snowridgex
      
       - Add 'perf stat' JSON output and a 'perf test' entry for it
      
       - Ignore memfd and anonymous mmap events if jitdump present
      
       - Refactor 'perf test' shell tests allowing subdirs
      
       - Fix an error handling path in 'parse_perf_probe_command()'
      
       - Fixes for the guest Intel PT tracing patchkit in the 1st batch of
         this merge window
      
       - Print debuginfod queries if -v option is used, to explain delays in
         processing when debuginfo servers are enabled to fetch DSOs with
         richer symbol tables
      
       - Improve error message for 'perf record -p not_existing_pid'
      
       - Fix openssl and libbpf feature detection
      
       - Add PMU pai_crypto event description for IBM z16 on 'perf list'
      
       - Fix typos and duplicated words on comments in various places
      
      * tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (81 commits)
        perf test: Refactor shell tests allowing subdirs
        perf vendor events: Update events for snowridgex
        perf vendor events: Update events and metrics for skylakex
        perf vendor events: Update metrics for sapphirerapids
        perf vendor events: Update events for knightslanding
        perf vendor events: Update metrics for jaketown
        perf vendor events: Update metrics for ivytown
        perf vendor events: Update events and metrics for icelakex
        perf vendor events: Update events and metrics for haswellx
        perf vendor events: Update events and metrics for cascadelakex
        perf vendor events: Update events and metrics for broadwellx
        perf vendor events: Update metrics for broadwellde
        perf jevents: Fold strings optimization
        perf jevents: Compress the pmu_events_table
        perf metrics: Copy entire pmu_event in find metric
        perf pmu-events: Hide the pmu_events
        perf pmu-events: Don't assume pmu_event is an array
        perf pmu-events: Move test events/metrics to JSON
        perf test: Use full metric resolution
        perf pmu-events: Hide pmu_events_map
        ...
    • Merge tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d785610f
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Ensure we never emit lwarx with EH=1 on 32-bit, because some 32-bit
         CPUs trap on it rather than ignoring it as they should.
      
       - Fix ftrace when building with clang, which was broken by some
         refactoring.
      
       - A couple of other minor fixes.
      
      Thanks to Christophe Leroy, Naveen N.  Rao, Nick Desaulniers, Ondrej
      Mosnacek, Pali Rohár, Russell Currey, and Segher Boessenkool.
      
      * tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/kexec: Fix build failure from uninitialised variable
        powerpc/ppc-opcode: Fix PPC_RAW_TW()
        powerpc64/ftrace: Fix ftrace for clang builds
        powerpc: Make eh value more explicit when using lwarx
        powerpc: Don't hide eh field of lwarx behind a macro
        powerpc: Fix eh field when calling lwarx on PPC32
    • Merge tag 'pull-work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · aea23e7c
      Linus Torvalds authored
      Pull /proc/mounts fix from Al Viro:
       "Fix for /proc/mounts escaping - escape the '#' character too"
      
      * tag 'pull-work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        vfs: escape hash as well
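
      The kind of escaping involved can be sketched as follows. This is a
      userspace illustration, not the kernel's code: model_mangle() is a
      hypothetical helper, and the escaped set (space, tab, newline,
      backslash and '#') and the three-digit octal form are assumptions based
      on the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing each special character with a backslash
 * followed by its three-digit octal code. Returns the length written,
 * or -1 if dst is too small.
 */
static int model_mangle(const char *src, char *dst, size_t dst_size)
{
    static const char escaped[] = " \t\n\\#";
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        size_t need = strchr(escaped, c) != NULL ? 4 : 1;

        /* Keep one byte in reserve for the terminating NUL. */
        if (used + need + 1 > dst_size)
            return -1;
        if (need == 4)
            snprintf(dst + used, 5, "\\%03o", (unsigned)c);
        else
            dst[used] = (char)c;
        used += need;
    }
    if (used + 1 > dst_size)
        return -1;
    dst[used] = '\0';
    return (int)used;
}
```

      Under these assumptions a path such as "/mnt/a b#c" is emitted with the
      space as \040 and the hash as \043, so a parser that treats '#' as a
      comment marker no longer truncates the line.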
    • Merge tag '5.20-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 332019e2
      Linus Torvalds authored
      Pull more cifs updates from Steve French:
      
       - two fixes for stable, one for a lock length miscalculation, and
         another fixes a lease break timeout bug
      
       - improvement to handle leases, allows the close timeout to be
         configured more safely
      
       - five restructuring/cleanup patches
      
      * tag '5.20-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Do not access tcon->cfids->cfid directly from is_path_accessible
        cifs: Add constructor/destructors for tcon->cfid
        SMB3: fix lease break timeout when multiple deferred close handles for the same file.
        smb3: allow deferred close timeout to be configurable
        cifs: Do not use tcon->cfid directly, use the cfid we get from open_cached_dir
        cifs: Move cached-dir functions into a separate file
        cifs: Remove {cifs,nfs}_fscache_release_page()
        cifs: fix lock length calculation
    • afs: Enable multipage folio support · 8549a263
      David Howells authored
      Enable multipage folio support for the afs filesystem.
      
      Support has already been implemented in netfslib, fscache and cachefiles
      and in most of afs, but I've waited for Matthew Wilcox's latest folio
      changes.
      
      Note that it does require a change to afs_write_begin() to return the
      correct subpage.  This is a "temporary" change as we're working on
      getting rid of the need for ->write_begin() and ->write_end()
      completely, at least as far as network filesystems are concerned - but
      it doesn't prevent afs from making use of the capability.

      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Tested-by: kafs-testing@auristor.com
      Cc: Marc Dionne <marc.dionne@auristor.com>
      Cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/lkml/2274528.1645833226@warthog.procyon.org.uk/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 13 Aug, 2022 26 commits