1. 20 Oct, 2022 21 commits
    • ovl: use posix acl api · 31acceb9
      Christian Brauner authored
      Now that posix acls have a proper api use it to copy them.
      
      All filesystems that can serve as lower or upper layers for overlayfs
      have gained support for the new posix acl api in previous patches.
      So switch all internal overlayfs codepaths for copying posix acls to the
      new posix acl api.
      Acked-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • ovl: implement set acl method · 0e641857
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      Now that we have added get and set acl inode operations that allow easy
      access to the dentry we give overlayfs its own get and set acl inode
      operations.
      
      The set acl inode operation duplicates most of the ovl posix acl
      xattr handler. The main difference is that the set acl inode
      operation relies on the new posix acl api. Once the vfs has been
      switched over the custom posix acl xattr handler will be removed
      completely.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Acked-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • ovl: implement get acl method · 6c0a8bfb
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      Now that we have added get and set acl inode operations that allow easy
      access to the dentry we give overlayfs its own get and set acl inode
      operations.
      
      Since overlayfs is a stacking filesystem it will use the newly added
      posix acl api when retrieving posix acls from the relevant layer.
      
      Overlayfs can also be mounted on top of idmapped layers. If idmapped
      layers are used overlayfs must take the layer's idmapping into account
      after it has retrieved the posix acls from the relevant layer.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
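The idmapping step described in "ovl: implement get acl method" can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the overlayfs implementation: `struct idmap_model`, `struct acl_model`, `map_id_model()` and `ovl_get_acl_model()` are hypothetical names invented for illustration, and the idmapping is reduced to a single contiguous range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_ID UINT32_MAX

/* Hypothetical one-range idmapping: ids [from, from + count) as stored by the
 * layer appear as [to, to + count) through the idmapped mount. */
struct idmap_model {
	uint32_t from;
	uint32_t to;
	uint32_t count;
};

struct acl_entry_model {
	uint32_t id;   /* uid or gid named by the entry */
	unsigned perm; /* rwx bits */
};

struct acl_model {
	size_t count;
	struct acl_entry_model entries[4];
};

/* Translate one id; ids outside the range have no mapping. */
uint32_t map_id_model(const struct idmap_model *map, uint32_t id)
{
	if (map == NULL)
		return id; /* the layer is not idmapped */
	if (id < map->from || id - map->from >= map->count)
		return MODEL_INVALID_ID;
	return map->to + (id - map->from);
}

/* Model of the stacking step: take the ACL the layer returned and apply the
 * layer's idmapping to a copy, leaving the layer's own ACL untouched. */
struct acl_model ovl_get_acl_model(const struct acl_model *layer_acl,
				   const struct idmap_model *layer_map)
{
	struct acl_model out = *layer_acl;

	assert(out.count <= 4);
	for (size_t i = 0; i < out.count; i++)
		out.entries[i].id = map_id_model(layer_map, out.entries[i].id);
	return out;
}
```

The model works on a copy because the ACL handed back by the layer may be shared with other users of that layer; whether the real code copies in the same way is an assumption of this sketch.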
    • ecryptfs: implement set acl method · 86c261b9
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      So far ecryptfs didn't implement get and set acl inode operations
      because it wanted easy access to the dentry. Now that we extended the
      set acl inode operation to take a dentry argument and added a new get
      acl inode operation that takes a dentry argument we can let ecryptfs
      implement get and set acl inode operations.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • ecryptfs: implement get acl method · af84016f
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      So far ecryptfs didn't implement get and set acl inode operations
      because it wanted easy access to the dentry. Now that we extended the
      set acl inode operation to take a dentry argument and added a new get
      acl inode operation that takes a dentry argument we can let ecryptfs
      implement get and set acl inode operations.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • ksmbd: use vfs_remove_acl() · b82784a2
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      Now that we've switched all filesystems that can serve as the lower
      filesystem for ksmbd we can switch ksmbd over to rely on
      the posix acl api. Note that this is orthogonal to switching the vfs
      itself over.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • acl: add vfs_remove_acl() · aeb7f005
      Christian Brauner authored
      In previous patches we implemented get and set inode operations for all
      non-stacking filesystems that support posix acls but didn't yet
      implement get and/or set acl inode operations. This specifically
      affected cifs and 9p.
      
      Now we can build a posix acl api based solely on get and set inode
      operations. We add a new vfs_remove_acl() api that can be used to remove
      posix acls. This finally removes all type unsafety and type conversion
      issues explained in detail in [1] that we aim to get rid of.
      
      After we finished building the vfs api we can switch stacking
      filesystems to rely on the new posix api and then finally switch the
      xattr system calls themselves to rely on the posix acl api.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • acl: add vfs_get_acl() · 4f353ba4
      Christian Brauner authored
      In previous patches we implemented get and set inode operations for all
      non-stacking filesystems that support posix acls but didn't yet
      implement get and/or set acl inode operations. This specifically
      affected cifs and 9p.
      
      Now we can build a posix acl api based solely on get and set inode
      operations. We add a new vfs_get_acl() api that can be used to get posix
      acls. This finally removes all type unsafety and type conversion issues
      explained in detail in [1] that we aim to get rid of.
      
      After we finished building the vfs api we can switch stacking
      filesystems to rely on the new posix api and then finally switch the
      xattr system calls themselves to rely on the posix acl api.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • acl: add vfs_set_acl() · e4cc9163
      Christian Brauner authored
      In previous patches we implemented get and set inode operations for all
      non-stacking filesystems that support posix acls but didn't yet
      implement get and/or set acl inode operations. This specifically
      affected cifs and 9p.
      
      Now we can build a posix acl api based solely on get and set inode
      operations. We add a new vfs_set_acl() api that can be used to set posix
      acls. This finally removes all type unsafety and type conversion issues
      explained in detail in [1] that we aim to get rid of.
      
      After we finished building the vfs api we can switch stacking
      filesystems to rely on the new posix api and then finally switch the
      xattr system calls themselves to rely on the posix acl api.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
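The three "acl: add vfs_*_acl()" messages describe an api built solely on the get and set acl inode operations, with typed ACL objects instead of a void blob. A rough userspace model of that shape follows. All names ending in `_model` are hypothetical and the signatures are simplified; the real helpers take further arguments and perform permission and security checks that are left out here. Treating removal as a set with a NULL ACL is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum acl_type_model { ACL_MODEL_ACCESS, ACL_MODEL_DEFAULT, ACL_MODEL_INVALID };

struct posix_acl_model { unsigned entries; };
struct dentry_model;

/* Typed inode operations: an ACL object goes in and out, never a raw buffer. */
struct inode_ops_model {
	int (*set_acl)(struct dentry_model *d, struct posix_acl_model *acl,
		       enum acl_type_model type);
	struct posix_acl_model *(*get_acl)(struct dentry_model *d,
					   enum acl_type_model type);
};

struct dentry_model {
	const struct inode_ops_model *ops;
	struct posix_acl_model *acl[2]; /* indexed by access/default type */
};

static enum acl_type_model acl_type_from_name(const char *name)
{
	if (strcmp(name, "system.posix_acl_access") == 0)
		return ACL_MODEL_ACCESS;
	if (strcmp(name, "system.posix_acl_default") == 0)
		return ACL_MODEL_DEFAULT;
	return ACL_MODEL_INVALID;
}

int vfs_set_acl_model(struct dentry_model *d, const char *name,
		      struct posix_acl_model *acl)
{
	enum acl_type_model type = acl_type_from_name(name);

	assert(d != NULL);
	if (type == ACL_MODEL_INVALID)
		return -EINVAL;
	if (d->ops == NULL || d->ops->set_acl == NULL)
		return -EOPNOTSUPP;
	return d->ops->set_acl(d, acl, type);
}

struct posix_acl_model *vfs_get_acl_model(struct dentry_model *d,
					  const char *name)
{
	enum acl_type_model type = acl_type_from_name(name);

	assert(d != NULL);
	if (type == ACL_MODEL_INVALID || d->ops == NULL || d->ops->get_acl == NULL)
		return NULL;
	return d->ops->get_acl(d, type);
}

/* Removal modelled as "set with no ACL". */
int vfs_remove_acl_model(struct dentry_model *d, const char *name)
{
	return vfs_set_acl_model(d, name, NULL);
}

/* A trivial in-memory filesystem implementing both operations. */
static int memfs_set_acl(struct dentry_model *d, struct posix_acl_model *acl,
			 enum acl_type_model type)
{
	d->acl[type] = acl;
	return 0;
}

static struct posix_acl_model *memfs_get_acl(struct dentry_model *d,
					     enum acl_type_model type)
{
	return d->acl[type];
}

const struct inode_ops_model memfs_ops_model = {
	.set_acl = memfs_set_acl,
	.get_acl = memfs_get_acl,
};
```

Because callers pass a `struct posix_acl_model *` rather than a byte buffer, a filesystem in this model never has to guess which format the data is in, which is the type safety the messages refer to.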
    • internal: add may_write_xattr() · 56851bc9
      Christian Brauner authored
      Split out the generic checks for whether an inode allows writing xattrs.
      Since security.* and system.* xattrs don't have any restrictions and we're
      going to split out posix acls into a dedicated api we will use this helper
      to check whether we can write posix acls.
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
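The idea of one shared write check used by both the xattr path and the posix acl path can be sketched as below. The message does not list which checks the helper performs; this model assumes immutable and append-only flags purely for illustration, and every name ending in `_model` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct inode_model {
	bool immutable;   /* assumed check: inode may not be modified at all */
	bool append_only; /* assumed check: inode only allows appends */
	int xattr_writes; /* counts successful xattr writes */
	int acl_writes;   /* counts successful posix acl writes */
};

/* Shared gate: returns 0 if metadata writes are allowed, -EPERM otherwise. */
int may_write_xattr_model(const struct inode_model *inode)
{
	assert(inode != NULL);
	if (inode->immutable)
		return -EPERM;
	if (inode->append_only)
		return -EPERM;
	return 0;
}

/* Generic xattr write path. */
int setxattr_model(struct inode_model *inode)
{
	int err = may_write_xattr_model(inode);

	if (err)
		return err;
	inode->xattr_writes++;
	return 0;
}

/* Dedicated posix acl write path reusing the same gate. */
int set_acl_model(struct inode_model *inode)
{
	int err = may_write_xattr_model(inode);

	if (err)
		return err;
	inode->acl_writes++;
	return 0;
}
```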
    • evm: add post set acl hook · a56df5d5
      Christian Brauner authored
      The security_inode_post_setxattr() hook is used by security modules to
      update their own security.* xattrs. Consequently none of the security
      modules operate on posix acls. So we don't need an additional security
      hook when post setting posix acls.
      
      However, the integrity subsystem wants to be informed about posix acl
      changes in order to reset the EVM status flag.
      
      -> evm_inode_post_setxattr()
         -> evm_update_evmxattr()
            -> evm_calc_hmac()
               -> evm_calc_hmac_or_hash()
      
      and evm_calc_hmac_or_hash() walks the global list of protected xattr
      names evm_config_xattrnames. This global list can be modified via
      /sys/kernel/security/integrity/evm/evm_xattrs. The write to "evm_xattrs" is
      restricted to security.* xattrs and the default xattrs in
      evm_config_xattrnames only contain security.* xattrs as well.
      
      So the actual value for posix acls is currently completely irrelevant
      for evm during evm_inode_post_setxattr() and frankly it should stay that
      way in the future to not cause the vfs any more headaches. But if the
      actual posix acl values matter then evm shouldn't operate on the binary
      void blob and try to hack around in the uapi struct anyway. Instead it
      should then in the future add a dedicated hook which takes a struct
      posix_acl argument passing the posix acls in the proper vfs format.
      
      For now it is sufficient to make evm_inode_post_set_acl() a wrapper
      around evm_inode_post_setxattr() not passing any actual values down.
      This will cause the hashes to be updated as before.
      Reviewed-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
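The wrapper described at the end of "evm: add post set acl hook" can be modelled in a few lines. This is a simplified userspace sketch: the `_model` types and functions are hypothetical stand-ins, and the bookkeeping fields exist only so the effect is observable.

```c
#include <assert.h>
#include <stddef.h>

struct posix_acl_model { unsigned entries; };

struct dentry_model {
	unsigned hash_updates;  /* how often the stored hash was recomputed */
	const void *last_value; /* value seen by the xattr hook */
	size_t last_len;
};

/* Stand-in for the generic post-setxattr hook: recompute the hash. */
void evm_post_setxattr_model(struct dentry_model *d, const char *name,
			     const void *value, size_t len)
{
	assert(d != NULL && name != NULL);
	d->hash_updates++;
	d->last_value = value;
	d->last_len = len;
}

/* Post-set-acl hook as a thin wrapper: the ACL itself is deliberately not
 * passed down, only the fact that something changed. */
void evm_post_set_acl_model(struct dentry_model *d, const char *acl_name,
			    struct posix_acl_model *kacl)
{
	(void)kacl;
	evm_post_setxattr_model(d, acl_name, NULL, 0);
}
```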
    • integrity: implement get and set acl hook · e61b135f
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      So far posix acls were passed as a void blob to the security and
      integrity modules. Some of them like evm then proceed to interpret the
      void pointer and convert it into the kernel internal struct posix acl
      representation to perform their integrity checking magic. This is
      obviously pretty problematic as that requires knowledge that only the
      vfs is guaranteed to have and has led to various bugs. Add a proper
      security hook for setting posix acls and pass down the posix acls in
      their appropriate vfs format instead of hacking it through a void
      pointer stored in the uapi format.
      
      I spent considerable time in the security module and integrity
      infrastructure and audited all codepaths. EVM is the only part that
      really has restrictions based on the actual posix acl values passed
      through it (e.g., i_mode). Before this dedicated hook EVM used to translate
      from the uapi posix acl format sent to it in the form of a void pointer
      into the vfs format. This is not a good thing. Instead of hacking around in
      the uapi struct give EVM the posix acls in the appropriate vfs format and
      perform sane permission checks that mirror what it used to do in the
      generic xattr hook.
      
      IMA doesn't have any restrictions on posix acls. When posix acls are
      changed it just wants to update its appraisal status to trigger an EVM
      revalidation.
      
      The removal of posix acls is equivalent to passing NULL to the posix set
      acl hooks. This is the same as before through the generic xattr api.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Acked-by: Paul Moore <paul@paul-moore.com> (LSM)
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • smack: implement get, set and remove acl hook · 44faac01
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      So far posix acls were passed as a void blob to the security and
      integrity modules. Some of them like evm then proceed to interpret the
      void pointer and convert it into the kernel internal struct posix acl
      representation to perform their integrity checking magic. This is
      obviously pretty problematic as that requires knowledge that only the
      vfs is guaranteed to have and has led to various bugs. Add a proper
      security hook for setting posix acls and pass down the posix acls in
      their appropriate vfs format instead of hacking it through a void
      pointer stored in the uapi format.
      
      I spent considerable time in the security module infrastructure and
      audited all codepaths. Smack has no restrictions based on the posix
      acl values passed through it. The capability hook doesn't need to be
      called either because it only has restrictions on security.* xattrs. So
      these all become very simple hooks for smack.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
      Reviewed-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • selinux: implement get, set and remove acl hook · 1bdeb218
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      So far posix acls were passed as a void blob to the security and
      integrity modules. Some of them like evm then proceed to interpret the
      void pointer and convert it into the kernel internal struct posix acl
      representation to perform their integrity checking magic. This is
      obviously pretty problematic as that requires knowledge that only the
      vfs is guaranteed to have and has led to various bugs. Add a proper
      security hook for setting posix acls and pass down the posix acls in
      their appropriate vfs format instead of hacking it through a void
      pointer stored in the uapi format.
      
      I spent considerable time in the security module infrastructure and
      audited all codepaths. SELinux has no restrictions based on the posix
      acl values passed through it. The capability hook doesn't need to be
      called either because it only has restrictions on security.* xattrs. So
      these are all fairly simple hooks for SELinux.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • security: add get, remove and set acl hook · 72b3897e
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      So far posix acls were passed as a void blob to the security and
      integrity modules. Some of them like evm then proceed to interpret the
      void pointer and convert it into the kernel internal struct posix acl
      representation to perform their integrity checking magic. This is
      obviously pretty problematic as that requires knowledge that only the
      vfs is guaranteed to have and has led to various bugs. Add a proper
      security hook for setting posix acls and pass down the posix acls in
      their appropriate vfs format instead of hacking it through a void
      pointer stored in the uapi format.
      
      In the next patches we implement the hooks for the few security modules
      that do actually have restrictions on posix acls.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • 9p: implement set acl method · 079da629
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      So far 9p implemented a ->get_inode_acl() operation that didn't require
      access to the dentry in order to allow (limited) permission checking via
      posix acls in the vfs. Now that we have get and set acl inode operations
      that take a dentry argument we can give 9p get and set acl inode
      operations.
      
      This is mostly a light refactoring of the codepaths currently used in the
      9p posix acl xattr handler. After we have fully implemented the posix acl
      api and switched the vfs over to it, the 9p specific posix acl xattr
      handler and associated code will be removed.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • 9p: implement get acl method · 6cd4d4e8
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      So far 9p implemented a ->get_inode_acl() operation that didn't require
      access to the dentry in order to allow (limited) permission checking via
      posix acls in the vfs. Now that we have get and set acl inode operations
      that take a dentry argument we can give 9p get and set acl inode
      operations.
      
      This is mostly a refactoring of the codepaths currently used in the 9p posix
      acl xattr handler. After we have fully implemented the posix acl api and
      switched the vfs over to it, the 9p specific posix acl xattr handler and
      associated code will be removed.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • cifs: implement set acl method · dc1af4c4
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      So far cifs wasn't able to implement get and set acl inode operations
      because it needs access to the dentry. Now that we extended the set acl
      inode operation to take a dentry argument and added a new get acl inode
      operation that takes a dentry argument we can let cifs implement get and
      set acl inode operations.
      
      This is mostly a copy and paste of the codepaths currently used in cifs'
      posix acl xattr handler. After we have fully implemented the posix acl
      api and switched the vfs over to it, the cifs specific posix acl xattr
      handler and associated code will be removed and the code duplication
      will go away.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • cifs: implement get acl method · bd9684b0
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      In order to build a type safe posix acl api around get and set acl we need
      all filesystems to implement get and set acl.
      
      So far cifs wasn't able to implement get and set acl inode operations
      because it needs access to the dentry. Now that we extended the set acl
      inode operation to take a dentry argument and added a new get acl inode
      operation that takes a dentry argument we can let cifs implement get and
      set acl inode operations.
      
      This is mostly a copy and paste of the codepaths currently used in cifs'
      posix acl xattr handler. After we have fully implemented the posix acl
      api and switched the vfs over to it, the cifs specific posix acl xattr
      handler and associated code will be removed and the code duplication
      will go away.
      
      Note, until the vfs has been switched to the new posix acl api this
      patch is a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
    • fs: add new get acl method · 7420332a
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      Since some filesystems rely on the dentry being available to them when
      setting posix acls (e.g., 9p and cifs) they cannot rely on the old get
      acl inode operation to retrieve posix acls and need to implement their
      own custom handlers because of that.
      
      In a previous patch we renamed the old get acl inode operation to
      ->get_inode_acl(). We decided to rename it and implement a new one since
      ->get_inode_acl() is called from generic_permission() and inode_permission()
      both of which can be called during a filesystem's ->permission()
      handler. So simply passing a dentry argument to ->get_acl() would have
      amounted to also having to pass a dentry argument to ->permission(). We
      avoided that change.
      
      This adds a new ->get_acl() inode operation which takes a dentry
      argument which filesystems such as 9p, cifs, and overlayfs can implement
      to get posix acls.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      7420332a
    • fs: rename current get acl method · cac2f8b8
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      The current inode operation for getting posix acls takes an inode
      argument but various filesystems (e.g., 9p, cifs, overlayfs) need access
      to the dentry. In contrast to the ->set_acl() inode operation we cannot
      simply extend ->get_acl() to take a dentry argument. The ->get_acl()
      inode operation is called from:
      
      acl_permission_check()
      -> check_acl()
         -> get_acl()
      
      which is part of generic_permission() which in turn is part of
      inode_permission(). Both generic_permission() and inode_permission() are
      called in the ->permission() handler of various filesystems (e.g.,
      overlayfs). So simply passing a dentry argument to ->get_acl() would
      amount to also having to pass a dentry argument to ->permission(). We
      should avoid this unnecessary change.
      
      So instead of extending the existing inode operation rename it from
      ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that
      passes a dentry argument and which filesystems that need access to the
      dentry can implement instead of ->get_inode_acl(). Filesystems like cifs
      which allow setting and getting posix acls but not using them for
      permission checking during lookup can simply not implement
      ->get_inode_acl().
      
      This is intended to be a non-functional change.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Suggested-by/Inspired-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      cac2f8b8
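      The split between ->get_inode_acl() and ->get_acl() described above can be
      sketched in plain userspace C. This is a simplified model, not the kernel
      code: the structures are stand-ins, the helper names
      acl_for_permission_check() and acl_for_acl_api() are hypothetical, and the
      fallback from ->get_acl() to ->get_inode_acl() is an assumption about how
      a caller holding a dentry would behave.

```c
#include <stddef.h>

/* Simplified userspace stand-ins; the real kernel structures carry far more. */
struct posix_acl { int entries; };
struct inode;
struct dentry { struct inode *d_inode; const char *name; };

struct inode_operations {
    /* Inode-only variant, reachable from the permission-checking path. */
    struct posix_acl *(*get_inode_acl)(struct inode *inode, int type);
    /* Dentry variant, for filesystems that need the dentry to fetch ACLs. */
    struct posix_acl *(*get_acl)(struct dentry *dentry, int type);
};

struct inode {
    const struct inode_operations *i_op;
    struct posix_acl *cached_acl;
};

/* Permission checking only has an inode, so only ->get_inode_acl() is usable. */
struct posix_acl *acl_for_permission_check(struct inode *inode, int type)
{
    if (!inode->i_op->get_inode_acl)
        return NULL; /* e.g. a cifs-like fs that skips ACL permission checks */
    return inode->i_op->get_inode_acl(inode, type);
}

/* The get/set ACL path has a dentry: prefer ->get_acl(), else fall back. */
struct posix_acl *acl_for_acl_api(struct dentry *dentry, int type)
{
    struct inode *inode = dentry->d_inode;

    if (inode->i_op->get_acl)
        return inode->i_op->get_acl(dentry, type);
    return acl_for_permission_check(inode, type);
}

/* Two hypothetical filesystems: one inode-based, one that needs the dentry. */
static struct posix_acl demo_acl = { 3 };

static struct posix_acl *local_get_inode_acl(struct inode *inode, int type)
{
    (void)type;
    return inode->cached_acl;
}

static struct posix_acl *remote_get_acl(struct dentry *dentry, int type)
{
    (void)type;
    return dentry->name ? &demo_acl : NULL;
}

const struct inode_operations local_fs_iops = { .get_inode_acl = local_get_inode_acl };
const struct inode_operations remote_fs_iops = { .get_acl = remote_get_acl };
```

      In this model the "remote" filesystem returns no ACL on the permission
      path but still serves ACLs to callers that hold a dentry, which mirrors
      the cifs case mentioned in the commit message.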
  2. 19 Oct, 2022 2 commits
    • fs: pass dentry to set acl method · 138060ba
      Christian Brauner authored
      The current way of setting and getting posix acls through the generic
      xattr interface is error prone and type unsafe. The vfs needs to
      interpret and fix up posix acls before storing them or reporting them to
      userspace. Various hacks exist to make this work. The code is hard to
      understand and difficult to maintain in its current form. Instead of
      making this work by hacking posix acls through xattr handlers we are
      building a dedicated posix acl api around the get and set inode
      operations. This removes a lot of hackiness and makes the codepaths
      easier to maintain. A lot of background can be found in [1].
      
      Since some filesystems rely on the dentry being available to them when
      setting posix acls (e.g., 9p and cifs), they cannot rely on the set acl
      inode operation. But since ->set_acl() is required in order to use the
      generic posix acl xattr handlers, filesystems that do not implement this
      inode operation cannot use the handler and need to implement their own
      dedicated posix acl handlers.
      
      Update the ->set_acl() inode method to take a dentry argument. This
      allows all filesystems to rely on ->set_acl().
      
      As far as I can tell all codepaths can be switched to rely on the dentry
      instead of just the inode. Note that the original motivation for passing
      the dentry separately from the inode, instead of just the dentry, in the
      xattr handlers was security modules that call
      security_d_instantiate(). This hook is called during
      d_instantiate_new(), d_add(), __d_instantiate_anon(), and
      d_splice_alias() to initialize the inode's security context and possibly
      to set security.* xattrs. Since this only affects security.* xattrs this
      is completely irrelevant for posix acls.
      
      Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      138060ba
    • orangefs: rework posix acl handling when creating new filesystem objects · 4053d250
      Christian Brauner authored
      When creating new filesystem objects orangefs used to create posix acls
      after it had created and inserted a new inode. This made it necessary to
      call posix_acl_chmod() on the newly created inode in case the mode of the
      inode would be changed by the posix acls.
      
      Instead of doing it this way calculate the correct mode directly before
      actually creating the inode. So we first create posix acls, then pass
      the mode that posix acls mandate into the orangefs getattr helper and
      calculate the correct mode. This is needed so we can simply change
      posix_acl_chmod() to take a dentry instead of an inode argument in the
      next patch.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      4053d250
  3. 16 Oct, 2022 10 commits
    • Linux 6.1-rc1 · 9abf2313
      Linus Torvalds authored
      9abf2313
    • Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random · f1947d7c
      Linus Torvalds authored
      Pull more random number generator updates from Jason Donenfeld:
       "This time with some large scale treewide cleanups.
      
        The intent of this pull is to clean up the way callers fetch random
        integers. The current rules for doing this right are:
      
         - If you want a secure or an insecure random u64, use get_random_u64()
      
         - If you want a secure or an insecure random u32, use get_random_u32()
      
           The old function prandom_u32() has been deprecated for a while
           now and is just a wrapper around get_random_u32(). Same for
           get_random_int().
      
         - If you want a secure or an insecure random u16, use get_random_u16()
      
         - If you want a secure or an insecure random u8, use get_random_u8()
      
         - If you want secure or insecure random bytes, use get_random_bytes().
      
           The old function prandom_bytes() has been deprecated for a while
           now and has long been a wrapper around get_random_bytes()
      
         - If you want a non-uniform random u32, u16, or u8 bounded by a
           certain open interval maximum, use prandom_u32_max()
      
           I say "non-uniform", because it doesn't do any rejection sampling
           or divisions. Hence, it stays within the prandom_*() namespace, not
           the get_random_*() namespace.
      
           I'm currently investigating a "uniform" function for 6.2. We'll see
           what comes of that.
      
        By applying these rules uniformly, we get several benefits:
      
         - By using prandom_u32_max() with an upper-bound that the compiler
           can prove at compile-time is ≤65536 or ≤256, internally
           get_random_u16() or get_random_u8() is used, which wastes fewer
           batched random bytes, and hence has higher throughput.
      
         - By using prandom_u32_max() instead of %, when the upper-bound is
           not a constant, division is still avoided, because
           prandom_u32_max() uses a faster multiplication-based trick instead.
      
         - By using get_random_u16() or get_random_u8() in cases where the
           return value is intended to indeed be a u16 or a u8, we waste fewer
           batched random bytes, and hence have higher throughput.
      
        This series was originally done by hand while I was on an airplane
        without Internet. Later, Kees and I worked on retroactively figuring
        out what could be done with Coccinelle and what had to be done
        manually, and then we split things up based on that.
      
        So while this touches a lot of files, the actual amount of code that's
        hand fiddled is comfortably small"
      
      * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
        prandom: remove unused functions
        treewide: use get_random_bytes() when possible
        treewide: use get_random_u32() when possible
        treewide: use get_random_{u8,u16}() when possible, part 2
        treewide: use get_random_{u8,u16}() when possible, part 1
        treewide: use prandom_u32_max() when possible, part 2
        treewide: use prandom_u32_max() when possible, part 1
      f1947d7c
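      The "multiplication-based trick" and the "non-uniform" remark in the pull
      message above can be illustrated with a small userspace sketch. It assumes
      the trick is a 64-bit multiply followed by a 32-bit right shift; the
      function names bounded_mul_shift() and bucket_counts_8bit() are
      hypothetical and are not kernel APIs. The 8-bit variant exists only so
      that every input can be enumerated to show the slight bias.

```c
#include <stdint.h>

/*
 * Map a 32-bit random value into [0, bound) with one multiply and one
 * shift, avoiding a division. There is no rejection sampling, so some
 * outputs are slightly more likely than others.
 */
uint32_t bounded_mul_shift(uint32_t rand32, uint32_t bound)
{
    return (uint32_t)(((uint64_t)rand32 * bound) >> 32);
}

/*
 * Same idea scaled down to 8-bit inputs so all 256 values can be
 * enumerated. counts[] must hold at least `bound` zeroed entries.
 */
void bucket_counts_8bit(uint32_t bound, unsigned int *counts)
{
    for (uint32_t r = 0; r < 256; r++)
        counts[(r * bound) >> 8]++;
}
```

      With a bound of 3, the 256 possible 8-bit inputs cannot be split evenly
      into three buckets, so one bucket receives one more input than the other
      two. The same effect exists with 32-bit inputs, only much smaller.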
    • Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of... · 8636df94
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
      
       - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
         when using bperf (perf BPF based counters) with cgroups.
      
       - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
         monitors bandwidth, latency, bus utilization and buffer occupancy.
      
         Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.
      
       - User space tasks can migrate between CPUs, so when tracing selected
         CPUs, system-wide sideband is still needed, fix it in the setup of
         Intel PT on hybrid systems.
      
       - Fix metricgroups title message in 'perf list', it should state that
         the metrics groups are to be used with the '-M' option, not '-e'.
      
       - Sync the msr-index.h copy with the kernel sources, adding support for
         using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well
         as decoding it when printing the MSR tracepoint arguments.
      
       - Fix program header size and alignment when generating a JIT ELF in
         'perf inject'.
      
       - Add multiple new Intel PT 'perf test' entries, including a jitdump
         one.
      
       - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
         running on PowerPC due to an invalid topology number in that arch.
      
       - Fix the 'perf test' for arm_coresight failures on the ARM Juno
         system.
      
       - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this
         option to the or expression expected in the intercepted
         perf_event_open() syscall.
      
       - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the
         'perf annotate' asm parser.
      
       - Fix 'perf mem record -C' option processing, it was being chopped up
         when preparing the underlying 'perf record -e mem-events' and thus
         being ignored, requiring using '-- -C CPUs' as a workaround.
      
       - Improvements and tidy ups for 'perf test' shell infra.
      
       - Fix Intel PT information printing segfault in uClibc, where a NULL
         format was being passed to fprintf.
      
      * tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
        perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
        perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
        perf tests stat+json_output: Include sanity check for topology
        perf tests stat+csv_output: Include sanity check for topology
        perf intel-pt: Fix system_wide dummy event for hybrid
        perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
        perf test: Fix attr tests for PERF_FORMAT_LOST
        perf test: test_intel_pt.sh: Add 9 tests
        perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
        perf test: test_intel_pt.sh: Add jitdump test
        perf test: test_intel_pt.sh: Tidy some alignment
        perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
        perf test: test_intel_pt.sh: Tidy some perf record options
        perf test: test_intel_pt.sh: Fix return checking again
        perf: Skip and warn on unknown format 'configN' attrs
        perf list: Fix metricgroups title message
        perf mem: Fix -C option behavior for perf mem record
        perf annotate: Add missing condition flags for arm64
        ...
      8636df94
    • Merge tag 'kbuild-fixes-v6.1' of... · 2df76606
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y compile error for the
         combination of Clang >= 14 and GAS <= 2.35.
      
       - Drop vmlinux.bz2 from the rpm package as it just annoyingly increased
         the package size.
      
       - Fix modpost error under build environments using musl.
      
       - Make *.ll files keep value names for easier debugging
      
       - Fix single directory build
      
       - Prevent RISC-V from selecting the broken DWARF5 support when Clang
         and GAS are used together.
      
      * tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
        kbuild: fix single directory build
        kbuild: add -fno-discard-value-names to cmd_cc_ll_c
        scripts/clang-tools: Convert clang-tidy args to list
        modpost: put modpost options before argument
        kbuild: Stop including vmlinux.bz2 in the rpm's
        Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
        Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5
      2df76606
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 2fcd8f10
      Linus Torvalds authored
      Pull more clk updates from Stephen Boyd:
       "This is the final part of the clk patches for this merge window.
      
        The clk rate range series needed another week to fully bake. Maxime
        fixed the bug that broke clk notifiers and prevented this from being
        included in the first pull request. He also added a unit test on top
        to make sure it doesn't break so easily again. The majority of the
        series fixes up how the clk_set_rate_*() APIs work, particularly
        around when the rate constraints are dropped and how they move around
        when reparenting clks. Overall it's a much needed improvement to the
        clk rate range APIs that used to be pretty broken if you looked
        sideways.
      
        Beyond the core changes there are a few driver fixes for a compilation
        issue or improper data causing clks to fail to register or have the
        wrong parents. These are good to get in before the first -rc so that
        the system actually boots on the affected devices"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (31 commits)
        clk: tegra: Fix Tegra PWM parent clock
        clk: at91: fix the build with binutils 2.27
        clk: qcom: gcc-msm8660: Drop hardcoded fixed board clocks
        clk: mediatek: clk-mux: Add .determine_rate() callback
        clk: tests: Add tests for notifiers
        clk: Update req_rate on __clk_recalc_rates()
        clk: tests: Add missing test case for ranges
        clk: qcom: clk-rcg2: Take clock boundaries into consideration for gfx3d
        clk: Introduce the clk_hw_get_rate_range function
        clk: Zero the clk_rate_request structure
        clk: Stop forwarding clk_rate_requests to the parent
        clk: Constify clk_has_parent()
        clk: Introduce clk_core_has_parent()
        clk: Switch from __clk_determine_rate to clk_core_round_rate_nolock
        clk: Add our request boundaries in clk_core_init_rate_req
        clk: Introduce clk_hw_init_rate_request()
        clk: Move clk_core_init_rate_req() from clk_core_round_rate_nolock() to its caller
        clk: Change clk_core_init_rate_req prototype
        clk: Set req_rate on reparenting
        clk: Take into account uncached clocks in clk_set_rate_range()
        ...
      2fcd8f10
    • Merge tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · b08cd744
      Linus Torvalds authored
      Pull more cifs updates from Steve French:
      
       - fix a regression in guest mounts to old servers
      
       - improvements to directory leasing (caching directory entries safely
         beyond the root directory)
      
       - symlink improvement (reducing roundtrips needed to process symlinks)
      
       - an lseek fix (for a problem where some dir entries could be skipped)
      
       - improved ioctl for returning more detailed information on directory
         change notifications
      
       - clarify multichannel interface query warning
      
       - cleanup fix (for better aligning buffers using ALIGN and round_up)
      
       - a compounding fix
      
       - fix some uninitialized variable bugs found by Coverity and the kernel
         test robot
      
      * tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: improve SMB3 change notification support
        cifs: lease key is uninitialized in two additional functions when smb1
        cifs: lease key is uninitialized in smb1 paths
        smb3: must initialize two ACL struct fields to zero
        cifs: fix double-fault crash during ntlmssp
        cifs: fix static checker warning
        cifs: use ALIGN() and round_up() macros
        cifs: find and use the dentry for cached non-root directories also
        cifs: enable caching of directories for which a lease is held
        cifs: prevent copying past input buffer boundaries
        cifs: fix uninitialised var in smb2_compound_op()
        cifs: improve symlink handling for smb2+
        smb3: clarify multichannel warning
        cifs: fix regression in very old smb1 mounts
        cifs: fix skipping to incorrect offset in emit_cached_dirents
      b08cd744
    • Revert "cpumask: fix checking valid cpu range". · 80493877
      Tetsuo Handa authored
      This reverts commit 78e5a339 ("cpumask: fix checking valid cpu range").
      
      syzbot is hitting WARN_ON_ONCE(cpu >= nr_cpumask_bits) warning at
      cpu_max_bits_warn() [1], for commit 78e5a339 ("cpumask: fix checking
      valid cpu range") is broken.  Obviously that patch hits WARN_ON_ONCE()
      when e.g.  reading /proc/cpuinfo because passing "cpu + 1" instead of
      "cpu" will trivially hit cpu == nr_cpumask_bits condition.
      
      Although syzbot found this problem in linux-next.git on 2022/09/27 [2],
      this problem was not fixed immediately.  As a result, that patch was
      sent to linux.git before the patch author recognized this problem, and
      syzbot started failing to test changes in linux.git since 2022/10/10
      [3].
      
      Andrew Jones proposed a fix for x86 and riscv architectures [4].  But
      [2] and [5] indicate that affected locations are not limited to arch
      code.  The longer it takes to find and fix the affected locations, the less
      tested the kernel is (and the harder it is to bisect and fix) before release.
      
      We should have inspected and fixed basically all cpumask users before
      applying that patch.  We should not crash kernels in order to ask
      existing cpumask users to update their code, even if limited to
      CONFIG_DEBUG_PER_CPU_MAPS=y case.
      
      Link: https://syzkaller.appspot.com/bug?extid=d0fd2bf0dd6da72496dd [1]
      Link: https://syzkaller.appspot.com/bug?extid=21da700f3c9f0bc40150 [2]
      Link: https://syzkaller.appspot.com/bug?extid=51a652e2d24d53e75734 [3]
      Link: https://lkml.kernel.org/r/20221014155845.1986223-1-ajones@ventanamicro.com [4]
      Link: https://syzkaller.appspot.com/bug?extid=4d46c43d81c3bd155060 [5]
      Reported-by: Andrew Jones <ajones@ventanamicro.com>
      Reported-by: syzbot+d0fd2bf0dd6da72496dd@syzkaller.appspotmail.com
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Yury Norov <yury.norov@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      80493877
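      The boundary problem described in the revert message can be modelled in
      userspace. This is an assumed, heavily simplified sketch: NR_BITS stands in
      for nr_cpumask_bits, check_cpu() stands in for the WARN_ON_ONCE() check,
      and the check_plus_one flag is a hypothetical switch between checking
      "cpu" and checking "cpu + 1". The real helpers and call sites differ.

```c
#include <stdbool.h>

#define NR_BITS 4 /* hypothetical stand-in for nr_cpumask_bits */

static int warn_count;

/* Models the range check: count a warning when the index is out of range. */
static void check_cpu(unsigned int cpu)
{
    if (cpu >= NR_BITS)
        warn_count++;
}

/* Return the next set bit strictly after n, or NR_BITS if there is none. */
static unsigned int next_bit(unsigned int mask, int n)
{
    for (unsigned int bit = (unsigned int)(n + 1); bit < NR_BITS; bit++)
        if (mask & (1u << bit))
            return bit;
    return NR_BITS;
}

/* Iterator step; -1 is a legal starting argument and is never checked. */
unsigned int next_cpu_checked(unsigned int mask, int n, bool check_plus_one)
{
    if (n != -1)
        check_cpu(check_plus_one ? (unsigned int)n + 1 : (unsigned int)n);
    return next_bit(mask, n);
}

/* Walk every set bit once and report how many warnings the walk produced. */
int warnings_during_walk(unsigned int mask, bool check_plus_one)
{
    int cpu = -1;

    warn_count = 0;
    while ((cpu = (int)next_cpu_checked(mask, cpu, check_plus_one)) < NR_BITS)
        ;
    return warn_count;
}
```

      In this model, an ordinary walk over a mask whose highest possible bit is
      set ends with a step from the last valid index. Checking "cpu" passes,
      while checking "cpu + 1" compares NR_BITS against NR_BITS and warns, even
      though the caller did nothing wrong.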
    • lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5 · 0a6de78c
      Nathan Chancellor authored
      When building a RISC-V kernel with DWARF5 debug info using clang
      and the GNU assembler, several instances of the following error appear:
      
        /tmp/vgettimeofday-48aa35.s:2963: Error: non-constant .uleb128 is not supported
      
      Dumping the .s file reveals these .uleb128 directives come from
      .debug_loc and .debug_ranges:
      
        .Ldebug_loc0:
                .byte   4                               # DW_LLE_offset_pair
                .uleb128 .Lfunc_begin0-.Lfunc_begin0    #   starting offset
                .uleb128 .Ltmp1-.Lfunc_begin0           #   ending offset
                .byte   1                               # Loc expr size
                .byte   90                              # DW_OP_reg10
                .byte   0                               # DW_LLE_end_of_list
      
        .Ldebug_ranges0:
                .byte   4                               # DW_RLE_offset_pair
                .uleb128 .Ltmp6-.Lfunc_begin0           #   starting offset
                .uleb128 .Ltmp27-.Lfunc_begin0          #   ending offset
                .byte   4                               # DW_RLE_offset_pair
                .uleb128 .Ltmp28-.Lfunc_begin0          #   starting offset
                .uleb128 .Ltmp30-.Lfunc_begin0          #   ending offset
                .byte   0                               # DW_RLE_end_of_list
      
      There is an outstanding binutils issue to support a non-constant operand
      to .sleb128 and .uleb128 in GAS for RISC-V but there does not appear to
      be any movement on it, due to concerns over how it would work with
      linker relaxation.
      
      To avoid these build errors, prevent DWARF5 from being selected when
      using clang and an assembler that does not have support for these symbol
      deltas, which can be easily checked in Kconfig with as-instr plus the
      small test program from the dwz test suite from the binutils issue.
      
      Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27215
      Link: https://github.com/ClangBuiltLinux/linux/issues/1719
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      0a6de78c
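      As background for the .uleb128 directives quoted above, here is a minimal
      sketch of unsigned LEB128 encoding. The function name uleb128_encode() is
      hypothetical. The point it illustrates is that the encoded length depends
      on the value, which is one plausible reason why a symbol delta that may
      still change during linker relaxation is awkward to encode.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode `value` as unsigned LEB128 into `out` and return the number of
 * bytes written. Each byte carries 7 payload bits; the top bit is set on
 * every byte except the last. `out` needs room for up to 10 bytes.
 */
size_t uleb128_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;

    do {
        uint8_t byte = (uint8_t)(value & 0x7f);

        value >>= 7;
        if (value)
            byte |= 0x80; /* more bytes follow */
        out[n++] = byte;
    } while (value);

    return n;
}
```

      A delta of 127 fits in one byte, while 128 already needs two, so the size
      of the surrounding section is only known once the delta itself is final.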
    • kbuild: fix single directory build · 3753af77
      Masahiro Yamada authored
      Commit f110e5a2 ("kbuild: refactor single builds of *.ko") was wrong.
      
      KBUILD_MODULES _is_ needed for single builds.
      
      Otherwise, "make foo/bar/baz/" does not build module objects at all.
      
      Fixes: f110e5a2 ("kbuild: refactor single builds of *.ko")
      Reported-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Tested-by: David Sterba <dsterba@suse.com>
      3753af77
    • Merge tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab · 1501278b
      Linus Torvalds authored
      Pull slab hotfix from Vlastimil Babka:
       "A single fix for the common-kmalloc series, for warnings on mips and
        sparc64 reported by Guenter Roeck"
      
      * tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
        mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocation
      1501278b
  4. 15 Oct, 2022 7 commits