    Merge tag 'vfs-6.8.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 8c9440fe
    Linus Torvalds authored
    Pull vfs mount updates from Christian Brauner:
     "This contains the work to retrieve detailed information about mounts
      via two new system calls. This is hopefully the beginning of the end
      of the saga that started with fsinfo() years ago.
    
      The LWN articles in [1] and [2] can serve as a summary so we can avoid
      rehashing everything here.
    
      At LSFMM in May 2022 we got into a room and agreed on what we want to
      do about fsinfo(). Basically, split it into pieces. This is the first
      part of that agreement. Specifically, it is concerned with retrieving
      information about mounts. So this only concerns the mount information
      retrieval, not the mount table change notification, or the extended
      filesystem specific mount option work. That is separate work.
    
      Currently mounts have a 32bit id. Mount ids are already in heavy use
      by libmount and other low-level userspace but they can't be relied
      upon because they're recycled very quickly. We agreed that mounts
      should carry a unique 64bit id by which they can be referenced
      directly. This is now implemented as part of this work.
    
      The new 64bit mount id is exposed in statx() through the new
      STATX_MNT_ID_UNIQUE flag. If the flag isn't raised the old mount id is
      returned. If it is raised and the kernel supports the new 64bit mount
      id the flag is raised in the result mask and the new 64bit mount id is
      returned. New and old mount ids do not overlap so they cannot be
      conflated.
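
The result-mask check described above can be sketched as follows. This
is a minimal illustration, not code from the series: the helper name
mnt_id_is_unique() is hypothetical, and the flag value is copied from
the Linux 6.8 uapi header so the sketch does not depend on installed
kernel headers. The caller is assumed to pass in the stx_mask value
that statx() returned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of STATX_MNT_ID_UNIQUE in the Linux 6.8 <linux/stat.h>;
 * defined locally so the sketch also builds against older headers. */
#ifndef STATX_MNT_ID_UNIQUE
#define STATX_MNT_ID_UNIQUE 0x00004000U
#endif

/*
 * Hypothetical helper: given the stx_mask returned by statx(), report
 * whether stx_mnt_id holds the new unique 64bit mount id (true) or the
 * old, quickly recycled mount id (false).
 */
bool mnt_id_is_unique(uint32_t result_mask)
{
    return (result_mask & STATX_MNT_ID_UNIQUE) != 0;
}
```

A caller would raise STATX_MNT_ID_UNIQUE in the request mask and then
pass the returned stx_mask to the helper before treating stx_mnt_id as
a unique id.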
    
      Two new system calls are introduced that operate on the 64bit mount
      id: statmount() and listmount(). LWN has a summary of the api and
      usage (cf. [3]), but I'll provide one here as well.
    
      Both system calls rely on struct mnt_id_req, which is the request
      struct used to pass the 64bit mount id identifying the mount to
      operate on. It is extensible to allow for the addition of new
      parameters and for future use in other apis that make use of mount
      ids.
    
      statmount() mimics the semantics of statx() and exposes a set of flags
      that userspace may raise in mnt_id_req to request specific information
      to be retrieved. A statmount() call returns a struct statmount filled
      in with information about the requested mount. For each supported
      request, the flag passed in struct mnt_id_req is raised in the
      @mask argument in struct statmount.
    
      Currently we support:
    
       - STATMOUNT_SB_BASIC
         Basic filesystem info
    
       - STATMOUNT_MNT_BASIC
         Mount information (mount id, parent mount id, mount attributes etc)
    
       - STATMOUNT_PROPAGATE_FROM
         Propagation from what mount in current namespace
    
       - STATMOUNT_MNT_ROOT
         Path of the root of the mount (e.g., mount --bind /bla /mnt returns /bla)
    
       - STATMOUNT_MNT_POINT
         Path of the mount point (e.g., mount --bind /bla /mnt returns /mnt)
    
       - STATMOUNT_FS_TYPE
         Name of the filesystem type as the magic number isn't enough due to submounts
    
      The string options STATMOUNT_MNT_{ROOT,POINT} and STATMOUNT_FS_TYPE
      are appended to the end of the struct. Userspace can use the offsets
      in @fs_type, @mnt_root, and @mnt_point to reference those strings
      easily.
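
The offset-based string lookup can be sketched with a simplified
stand-in for the reply. Everything here is an assumption made for
illustration: struct mock_statmount is not the kernel's struct
statmount (it keeps only three offset fields and a fixed-size string
area), the offsets are assumed to be relative to the start of that
string area, and the "ext4" value is made up. The /bla and /mnt paths
follow the bind mount example above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct statmount (NOT the kernel layout). */
struct mock_statmount {
    uint32_t fs_type;    /* offset of the filesystem type string */
    uint32_t mnt_root;   /* offset of the mount root path        */
    uint32_t mnt_point;  /* offset of the mount point path       */
    char str[64];        /* string area at the end of the struct */
};

/* Resolve an offset to the NUL-terminated string it refers to. */
const char *sm_string(const struct mock_statmount *sm, uint32_t offset)
{
    assert(offset < sizeof(sm->str));
    return sm->str + offset;
}

/* Append one string to the string area; returns the new fill level. */
uint32_t sm_append(struct mock_statmount *sm, uint32_t used, const char *s)
{
    size_t len = strlen(s) + 1; /* include the terminating NUL */

    assert(used + len <= sizeof(sm->str));
    memcpy(sm->str + used, s, len);
    return used + (uint32_t)len;
}

/* Build a mock reply for "mount --bind /bla /mnt" on a made-up ext4. */
struct mock_statmount sm_example(void)
{
    struct mock_statmount sm = {0};
    uint32_t used = 0;

    sm.fs_type = used;
    used = sm_append(&sm, used, "ext4");
    sm.mnt_root = used;
    used = sm_append(&sm, used, "/bla");
    sm.mnt_point = used;
    sm_append(&sm, used, "/mnt");
    return sm;
}
```

With a reply built by sm_example(), sm_string(&sm, sm.mnt_root) yields
"/bla" and sm_string(&sm, sm.mnt_point) yields "/mnt".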
    
      The struct statmount reserves quite a bit of space currently for
      future extensibility. This isn't really a problem and if this bothers
      us we can just send a follow-up pull request during this cycle.
    
      listmount() is given a 64bit mount id via mnt_id_req just as
      statmount(). It takes a buffer and a size to return an array of the
      64bit ids of the child mounts of the requested mount. Userspace can
      thus choose to either retrieve child mounts for a mount in batches or
      iterate through the child mounts. For most use-cases it will be
      sufficient to just leave space for a few child mounts. But for big
      mount tables having an iterator is really helpful. Iterating through a
      mount table works by setting @param in mnt_id_req to the mount id of
      the last child mount retrieved in the previous listmount() call."
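
The batched iteration described for listmount() can be sketched
without touching the kernel. This is only an illustration of the loop
shape: mock_listmount() is a hypothetical stand-in that serves ids from
an in-memory array, it assumes that array is sorted in ascending order,
and it treats a last id of 0 as "start from the beginning". The real
system call's signature and ordering guarantees are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH 2 /* deliberately small so several calls are needed */

/*
 * Hypothetical stand-in for listmount(): copy up to bufsize child ids
 * that come after last_id into buf and return how many were copied.
 * Assumes children[] is sorted in ascending order.
 */
size_t mock_listmount(const uint64_t *children, size_t nchildren,
                      uint64_t last_id, uint64_t *buf, size_t bufsize)
{
    size_t n = 0;

    for (size_t i = 0; i < nchildren && n < bufsize; i++) {
        if (children[i] > last_id)
            buf[n++] = children[i];
    }
    return n;
}

/*
 * Collect every child id into out[] by calling mock_listmount()
 * repeatedly, feeding the last id of each batch back in as the
 * starting point of the next call. Returns the number of ids stored.
 */
size_t collect_children(const uint64_t *children, size_t nchildren,
                        uint64_t *out, size_t outcap)
{
    uint64_t buf[BATCH];
    uint64_t last_id = 0;
    size_t total = 0;

    for (;;) {
        size_t n = mock_listmount(children, nchildren, last_id, buf, BATCH);

        if (n == 0)
            break; /* no children left after last_id */
        for (size_t i = 0; i < n; i++) {
            assert(total < outcap);
            out[total++] = buf[i];
        }
        last_id = buf[n - 1]; /* resume after the last id seen */
    }
    return total;
}
```

A caller that only cares about a handful of children can make a single
call with a small buffer; the loop is for large mount tables.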
    
    Link: https://lwn.net/Articles/934469 [1]
    Link: https://lwn.net/Articles/829212 [2]
    Link: https://lwn.net/Articles/950569 [3]
    
    * tag 'vfs-6.8.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
      add selftest for statmount/listmount
      fs: keep struct mnt_id_req extensible
      wire up syscalls for statmount/listmount
      add listmount(2) syscall
      statmount: simplify string option retrieval
      statmount: simplify numeric option retrieval
      add statmount(2) syscall
      namespace: extract show_path() helper
      mounts: keep list of mounts in an rbtree
      add unique mount ID