1. 28 Dec, 2023 22 commits
    • netfs: Implement a write-through caching option · 41d8e767
      David Howells authored
      Provide a flag whereby a filesystem may request that netfs_perform_write()
      perform write-through caching.  This involves putting pages directly into
      writeback rather than dirty and attaching them to a write operation as we
      go.
      
      Further, the writes being made are limited to the byte range being written
      rather than whole folios being written.  This can be used by cifs, for
      example, to deal with strict byte-range locking.
      
      This can't be used with content encryption as that may require expansion of
      the write RPC beyond the write being made.
      
      This doesn't affect writes via mmap - those are written back in the normal
      way; similarly failed writethrough writes are marked dirty and left to
      writeback to retry.  Another option would be to simply invalidate them, but
      the contents can be simultaneously accessed by read() and through mmap.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Provide a launder_folio implementation · 4a79616c
      David Howells authored
      Provide a launder_folio implementation for netfslib.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Provide a writepages implementation · 62c3b748
      David Howells authored
      Provide an implementation of writepages for network filesystems to delegate
      to.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs, cachefiles: Pass upper bound length to allow expansion · e0ace6ca
      David Howells authored
      Make netfslib pass the maximum length to the ->prepare_write() op to tell
      the cache how much it can expand the length of a write to.  This allows a
      write to the server at the end of a file to be limited to a few bytes
      whilst writing an entire block to the cache (something required by direct
      I/O).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
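The length expansion described in the commit above can be illustrated with a small userspace sketch. It assumes the cache rounds the end of a write up to its own block size and then clamps the result to the upper bound it was given. The function name, the parameters and the rounding policy are hypothetical; the real ->prepare_write() signature is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: work out how many bytes a cache might write when it
 * is allowed to expand a write of len bytes at start up to upper_len bytes.
 * The end of the write is rounded up to block_size, then clamped.
 */
static size_t expand_cache_write_len(uint64_t start, size_t len,
				     size_t upper_len, size_t block_size)
{
	uint64_t end, rounded_end;
	size_t want;

	assert(block_size > 0 && len <= upper_len);
	end = start + len;
	rounded_end = (end + block_size - 1) / block_size * block_size;
	want = (size_t)(rounded_end - start);
	return want < upper_len ? want : upper_len;
}
```

With a 10-byte write at offset 8192, a 4096-byte block and a 4096-byte upper bound, the server write stays at 10 bytes while the cache write in this sketch grows to a whole block.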
    • netfs: Provide netfs_file_read_iter() · 80645bd4
      David Howells authored
      Provide a top-level-ish function that can be pointed to directly by the
      ->read_iter file op.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Allow buffered shared-writeable mmap through netfs_page_mkwrite() · 102a7e2c
      David Howells authored
      Provide an entry point to delegate a filesystem's ->page_mkwrite() to.
      This checks for conflicting writes, then attaches any netfs-specific group
      marking (e.g. ceph snap) to the page to be considered dirty.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Implement buffered write API · 938e13a7
      David Howells authored
      Institute a netfs write helper, netfs_file_write_iter(), to be pointed at
      by the network filesystem ->write_iter() call.  Make it handle buffered
      writes by calling the previously defined netfs_perform_write() to copy the
      source data into the pagecache.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Implement unbuffered/DIO write support · 153a9961
      David Howells authored
      Implement support for unbuffered writes and direct I/O writes.  If the
      write is misaligned with respect to the fscrypt block size, then RMW cycles
      are performed if necessary.  DIO writes are a special case of unbuffered
      writes with extra restrictions imposed, such as block size alignment
      requirements.
      
      Also provide a field that can tell the code to add some extra space onto
      the bounce buffer for use by the filesystem in the case of a
      content-encrypted file.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
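To illustrate the read-modify-write (RMW) decision mentioned in the commit above, here is a minimal userspace sketch of the alignment arithmetic. It assumes a power-of-two block size; the struct and function names are hypothetical and are not taken from netfslib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct rmw_range_sketch {
	uint64_t start;	/* write start rounded down to a block boundary */
	uint64_t end;	/* write end rounded up to a block boundary */
	bool need_rmw;	/* true if either edge of the write is misaligned */
};

/* Work out the block-aligned span covering a write of len bytes at pos. */
static struct rmw_range_sketch rmw_range(uint64_t pos, uint64_t len,
					 uint64_t block_size)
{
	uint64_t mask = block_size - 1;
	struct rmw_range_sketch r;

	/* The masking below is only valid for a non-zero power of two. */
	assert(block_size != 0 && (block_size & mask) == 0);
	r.start = pos & ~mask;
	r.end = (pos + len + mask) & ~mask;
	r.need_rmw = (pos & mask) != 0 || ((pos + len) & mask) != 0;
	return r;
}
```

When need_rmw is set, the bytes between start and end that the write does not cover would have to be read first, so that whole blocks can be processed and written back.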
    • netfs: Implement unbuffered/DIO read support · 016dc851
      David Howells authored
      Implement support for unbuffered and DIO reads in the netfs library,
      utilising the existing read helper code to do block splitting and
      individual queuing.  The code also handles extraction of the destination
      buffer from the supplied iterator, allowing async unbuffered reads to take
      place.
      
      The read will be split up according to the rsize setting and, if supplied,
      the ->clamp_length() method.  Note that the next subrequest will be issued
      as soon as issue_op returns, without waiting for previous ones to finish.
      The network filesystem needs to pause or handle queuing them if it doesn't
      want to fire them all at the server simultaneously.
      
      Once all the subrequests have finished, the state will be assessed and the
      amount of data to be indicated as having been obtained will be
      determined.  As the subrequests may finish in any order, if an intermediate
      subrequest is short, any further subrequests may be copied into the buffer
      and then abandoned.
      
      In the future, this will also take care of doing an unbuffered read from
      encrypted content, with the decryption being done by the library.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
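The final assessment step described in the commit above can be sketched in userspace as follows. The sketch assumes the subrequests are held in file order in an array and that only data up to and including the first short subrequest counts as obtained. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct subreq_sketch {
	size_t len;		/* bytes requested */
	size_t transferred;	/* bytes actually obtained */
};

/* Count the contiguous data obtained, stopping at the first short subrequest. */
static size_t assess_read(const struct subreq_sketch *subreqs, size_t nr)
{
	size_t total = 0;

	assert(subreqs != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		total += subreqs[i].transferred;
		if (subreqs[i].transferred < subreqs[i].len)
			break;	/* data from later subrequests is abandoned */
	}
	return total;
}
```

Because the subrequests may complete in any order, a later subrequest can already have filled its part of the buffer by the time an earlier one turns out to be short. The sketch simply leaves that data out of the returned total.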
    • netfs: Allocate multipage folios in the writepath · e2e2e839
      David Howells authored
      Allocate a multipage folio when copying data into the pagecache, if
      possible and if there's sufficient data to warrant it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Make netfs_read_folio() handle streaming-write pages · 7f84a7b9
      David Howells authored
      netfs_read_folio() needs to handle partially-valid pages that are marked
      dirty, but not uptodate, in the event that someone tries to read a page
      that was used to cache data by a streaming write.
      
      In such a case, make netfs_read_folio() set up a bvec iterator that points
      to the parts of the folio that need filling and to a sink page for the data
      that should be discarded and use that instead of i_pages as the iterator to
      be written to.
      
      This requires netfs_rreq_unlock_folios() to convert the page into a normal
      dirty uptodate page, getting rid of the partial write record and bumping
      the group pointer over to folio->private.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
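The iterator layout described in the commit above (fill the missing parts of the folio, divert the already-dirty part to a sink page) can be sketched in userspace like this. It assumes a single dirty region per folio, described by an offset and a length. All names are hypothetical, and plain structs stand in for the bvec entries.

```c
#include <assert.h>
#include <stddef.h>

enum seg_target_sketch { SEG_FOLIO, SEG_SINK };

struct read_seg_sketch {
	enum seg_target_sketch target;	/* where the read data should land */
	size_t offset;			/* offset within the folio */
	size_t len;			/* length of this segment */
};

/*
 * Plan a read over a folio holding streaming-write data in
 * [dirty_off, dirty_off + dirty_len).  Returns the number of segments
 * stored in out, which is at most three.
 */
static size_t plan_streaming_read(size_t folio_size, size_t dirty_off,
				  size_t dirty_len,
				  struct read_seg_sketch out[3])
{
	size_t dirty_end = dirty_off + dirty_len;
	size_t n = 0;

	assert(dirty_end <= folio_size);
	if (dirty_off > 0)	/* gap before the dirty data: fill it */
		out[n++] = (struct read_seg_sketch){ SEG_FOLIO, 0, dirty_off };
	if (dirty_len > 0)	/* dirty data must not be overwritten */
		out[n++] = (struct read_seg_sketch){ SEG_SINK, dirty_off, dirty_len };
	if (dirty_end < folio_size)	/* gap after the dirty data: fill it */
		out[n++] = (struct read_seg_sketch){ SEG_FOLIO, dirty_end,
						     folio_size - dirty_end };
	return n;
}
```

Routing the middle segment to a sink lets a single contiguous read cover the whole folio without clobbering the newer data written by the streaming write.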
    • netfs: Provide func to copy data to pagecache for buffered write · c38f4e96
      David Howells authored
      Provide a netfs write helper, netfs_perform_write() to buffer data to be
      written in the pagecache and mark the modified folios dirty.
      
      It will perform "streaming writes" for folios that aren't currently
      resident, if possible, storing data in partially modified folios that are
      marked dirty, but not uptodate.  It will also tag pages as belonging to
      fs-specific write groups if so directed by the filesystem.
      
      This is derived from generic_perform_write(), but doesn't use
      ->write_begin() and ->write_end(), having that logic rolled in instead.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Dispatch write requests to process a writeback slice · 0e0f2dfe
      David Howells authored
      Dispatch one or more write requests to process a writeback slice, where a
      slice is tailored more to logical block divisions within the file (such as
      crypto blocks, an object layout or cache granules) than the protocol RPC
      maximum capacity.
      
      The dispatch doesn't happen until throttling allows, at which point the
      entire writeback slice is processed and queued.  A slice may be written to
      multiple destinations (one or more servers and the local cache) and the
      writes to each destination might be split up along different lines.
      
      The writeback slice holds the required folios pinned.  An iov_iter is
      provided in netfs_write_request that describes the buffer to be used.  This
      may be part of the pagecache, may have auxiliary padding pages attached or
      may be a bounce buffer resulting from crypto or compression.  Consequently,
      the filesystem must not twiddle the folio markings directly.
      
      The following API is available to the filesystem:
      
       (1) The ->create_write_requests() method is called to ask the filesystem
           to create the requests it needs.  This is passed the writeback slice
           to be processed.
      
       (2) The filesystem should then call netfs_create_write_request() to create
           the requests it needs.
      
       (3) Once a request is initialised, netfs_queue_write_request() can be
           called to dispatch it asynchronously, if not completed immediately.
      
       (4) netfs_write_request_completed() should be called to note the
           completion of a request.
      
       (5) netfs_get_write_request() and netfs_put_write_request() are provided
           to refcount a request.  These take constants from the netfs_wreq_trace
           enum for logging into ftrace.
      
       (6) The ->free_write_request() method is called to ask the filesystem to
           clean up a request.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Prep to use folio->private for write grouping and streaming write · 9ebff83e
      David Howells authored
      Prepare to use folio->private to hold information for write grouping and
      streaming write.  These are implemented in the same commit as they both
      make use of folio->private and will both be checked at the same time in
      several places.
      
      "Write grouping" involves ordering the writeback of groups of writes, such
      as is needed for ceph snaps.  A group is represented by a
      filesystem-supplied object which must contain a netfs_group struct.  This
      contains just a refcount and a pointer to a destructor.
      
      "Streaming write" is the storage of data in folios that are marked dirty,
      but not uptodate, to avoid unnecessary reads of data.  This is represented
      by a netfs_folio struct.  This contains the offset and length of the
      modified region plus the otherwise displaced write grouping pointer.
      
      The way folio->private is multiplexed is:
      
       (1) If private is NULL then neither is in operation on a dirty folio.
      
       (2) If private is set, with bit 0 clear, then this points to a group.
      
       (3) If private is set, with bit 0 set, then this points to a netfs_folio
           struct (with bit 0 AND'ed out).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
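The three-way multiplexing of folio->private listed in the commit above is a tagged-pointer scheme. The following userspace sketch shows one way such a scheme can be decoded. The struct layouts and helper names are hypothetical stand-ins for the netfs_group and netfs_folio structs named in the message, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the structs named in the commit message. */
struct group_sketch { int refcount; };
struct folio_info_sketch {
	struct group_sketch *group;	/* displaced write-grouping pointer */
	size_t dirty_offset;		/* start of the modified region */
	size_t dirty_len;		/* length of the modified region */
};

#define PRIV_STREAMING_BIT ((uintptr_t)1)	/* bit 0 marks a streaming-write record */

/* Encode a streaming-write record by setting bit 0 on its pointer. */
static void *priv_from_info(struct folio_info_sketch *info)
{
	/* The tag only works if the pointer is at least 2-byte aligned. */
	assert(((uintptr_t)info & PRIV_STREAMING_BIT) == 0);
	return (void *)((uintptr_t)info | PRIV_STREAMING_BIT);
}

/* Case (3): return the record if bit 0 is set, otherwise NULL. */
static struct folio_info_sketch *priv_to_info(void *priv)
{
	if ((uintptr_t)priv & PRIV_STREAMING_BIT)
		return (struct folio_info_sketch *)
			((uintptr_t)priv & ~PRIV_STREAMING_BIT);
	return NULL;
}

/* Cases (1) to (3): find the write group wherever it is stored. */
static struct group_sketch *priv_to_group(void *priv)
{
	struct folio_info_sketch *info = priv_to_info(priv);

	if (info)
		return info->group;
	return priv;	/* NULL, or a plain group pointer with bit 0 clear */
}
```

The scheme relies on both structs being aligned so that bit 0 of a valid pointer is always clear, which leaves that bit free to act as the discriminator.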
    • netfs: Make the refcounting of netfs_begin_read() easier to use · 4fcccc38
      David Howells authored
      Make the refcounting of netfs_begin_read() easier to use by not eating the
      caller's ref on the netfs_io_request it's given.  This makes it easier to
      use when we need to look in the request struct after.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Make netfs_put_request() handle a NULL pointer · 6ba22d8d
      David Howells authored
      Make netfs_put_request() just return if given a NULL request pointer.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
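As an illustration of the NULL-tolerant put pattern that the commit above describes, here is a minimal userspace sketch. The struct and function names are hypothetical, the real function's parameters are not shown in the commit message, and the sketch ignores concurrency by using a plain integer counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct request_sketch {
	int ref;	/* reference count */
	bool freed;	/* set once the last reference is dropped */
};

/* Drop a reference; a NULL pointer is accepted and ignored. */
static void put_request_sketch(struct request_sketch *req)
{
	if (!req)
		return;
	assert(req->ref > 0);
	if (--req->ref == 0)
		req->freed = true;	/* stand-in for freeing the request */
}
```

Accepting NULL lets error paths call the put function unconditionally, without first checking whether the request was ever allocated.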
    • netfs: Add a hook to allow tell the netfs to update its i_size · c6dc54dd
      David Howells authored
      Add a hook for netfslib's write helpers to call to tell the network
      filesystem that it should update its i_size.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Extend the netfs_io_*request structs to handle writes · 16af134c
      David Howells authored
      Modify the netfs_io_request struct to act as a point around which writes
      can be coordinated.  It represents and pins a range of pages that need
      writing and a list of regions of dirty data in that range of pages.
      
      If RMW is required, the original data can be downloaded into the bounce
      buffer, decrypted if necessary, the modifications made, then the modified
      data can be reencrypted/recompressed and sent back to the server.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Limit subrequest by size or number of segments · 768ddb1e
      David Howells authored
      Limit a subrequest to a maximum size and/or a maximum number of contiguous
      physical regions.  This permits, for instance, a subreq's iterator to be
      limited to the number of DMA'able segments that a large RDMA request can
      handle.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Add func to calculate pagecount/size-limited span of an iterator · cae932d3
      David Howells authored
      Add a function to work out how much of an ITER_BVEC or ITER_XARRAY iterator
      we can use in a pagecount-limited and size-limited span.  This will be
      used, for example, to limit the number of segments in a subrequest to the
      maximum number of elements that an RDMA transfer can handle.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
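The span calculation described in the commit above can be modelled in userspace as a walk over a list of segments that stops at whichever limit is hit first. In this sketch a plain array of lengths stands in for the ITER_BVEC or ITER_XARRAY iterator. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct seg_sketch {
	size_t len;	/* length of one contiguous segment */
};

/*
 * Return how many bytes, starting from segment 0, can be used without
 * exceeding max_size bytes or max_segs segments.
 */
static size_t limit_span(const struct seg_sketch *segs, size_t nr_segs,
			 size_t max_size, size_t max_segs)
{
	size_t span = 0;

	assert(segs != NULL || nr_segs == 0);
	for (size_t i = 0; i < nr_segs && i < max_segs && span < max_size; i++) {
		size_t len = segs[i].len;

		if (len > max_size - span)
			len = max_size - span;	/* last segment is truncated */
		span += len;
	}
	return span;
}
```

For three 4096-byte segments, a size limit of 10000 yields a span of 10000 bytes, while a limit of two segments yields 8192 bytes regardless of the size limit.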
    • netfs: Provide tools to create a buffer in an xarray · 7d828a06
      David Howells authored
      Provide tools to create a buffer in an xarray, with a function to add new
      folios with a mark.  This will be used to create a bounce buffer and can be
      used more easily to create a list of folios the span of which would require
      more than a page's worth of bio_vec structs.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Add support for DIO buffering · 21d706d5
      David Howells authored
      Add a bvec array pointer and an iterator to netfs_io_request for either
      holding a copy of a DIO iterator or a list of all the bits of buffer
      pointed to by a DIO iterator.
      
      There are two problems:  Firstly, if an iovec-class iov_iter is passed to
      ->read_iter() or ->write_iter(), this cannot be passed directly to
      kernel_sendmsg() or kernel_recvmsg() as that may cause locking recursion if
      a fault is generated, so we need to keep track of the pages involved
      separately.
      
      Secondly, if the I/O is asynchronous, we must copy the iov_iter describing
      the buffer before returning to the caller as it may be immediately
      deallocated.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
  2. 24 Dec, 2023 15 commits
  3. 23 Dec, 2023 3 commits
    • Merge tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3f82f1c3
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - Fix a secondary CPUs enumeration regression caused by creative MADT
         APIC table entries on certain systems.
      
       - Fix a race in the NOP-patcher that can spuriously trigger crashes on
         bootup.
      
       - Fix a bootup failure regression caused by the parallel bringup code,
         caused by firmware inconsistency between the APIC initialization
         states of the boot and secondary CPUs, on certain systems.
      
      * tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/acpi: Handle bogus MADT APIC tables gracefully
        x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
        x86/alternatives: Sync core before enabling interrupts
        x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · f969c914
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Four small fixes, three in drivers with the core one adding a batch
        indicator (for drivers which use it) to the error handler"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Let the sq_lock protect sq_tail_slot access
        scsi: ufs: qcom: Return ufs_qcom_clk_scale_*() errors in ufs_qcom_clk_scale_notify()
        scsi: core: Always send batch on reset or error handling command
        scsi: bnx2fc: Fix skb double free in bnx2fc_rcv()
    • Merge tag 'usb-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 4b2ee6d2
      Linus Torvalds authored
      Pull USB / Thunderbolt fixes from Greg KH:
       "Here are some small bugfixes and new device ids for USB and
        Thunderbolt drivers for 6.7-rc7. Included in here are:
      
         - new usb-serial device ids
      
         - thunderbolt driver fixes
      
         - typec driver fix
      
         - usb-storage driver quirk added
      
         - fotg210 driver fix
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: serial: option: add Quectel EG912Y module support
        USB: serial: ftdi_sio: update Actisense PIDs constant names
        usb: fotg210-hcd: delete an incorrect bounds test
        usb-storage: Add quirk for incorrect WP on Kingston DT Ultimate 3.0 G3
        usb: typec: ucsi: fix gpio-based orientation detection
        net: usb: ax88179_178a: avoid failed operations when device is disconnected
        USB: serial: option: add Quectel RM500Q R13 firmware support
        USB: serial: option: add Foxconn T99W265 with new baseline
        thunderbolt: Fix minimum allocated USB 3.x and PCIe bandwidth
        thunderbolt: Fix memory leak in margining_port_remove()