1. 11 Jan, 2024 1 commit
  2. 09 Jan, 2024 2 commits
  3. 05 Jan, 2024 4 commits
    • netfs: Fix the loop that unmarks folios after writing to the cache · 807c6d09
      David Howells authored
      In the loop in netfs_rreq_unmark_after_write() that removes the PG_fscache
      mark from folios after they've been written to the cache, as soon as we
      remove the mark from a multipage folio, it can get split - and then we
      might see a fragment of the folio again.
      
      Guard against this by advancing the 'unlocked' tracker to the index of the
      last page in the folio to avoid a double removal of the PG_fscache mark.
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Matthew Wilcox <willy@infradead.org>
      cc: linux-afs@lists.infradead.org
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
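The guard described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `struct toy_folio`, `unmark_after_write()` and the `seen` array are hypothetical stand-ins for folios and for the order in which the walk encounters them, and the mark removal is modelled as a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a run of pages carrying one mark. */
struct toy_folio {
	unsigned long index;    /* index of the first page in the folio */
	unsigned long nr_pages; /* number of pages the folio covers */
	int unmark_count;       /* how many times the mark was removed */
};

/*
 * Walk the folios in the order they are encountered and remove the mark
 * at most once per page range.  After a folio is handled, 'unlocked' is
 * advanced to the index of its last page, so a fragment that appears
 * later (because the folio was split) is skipped.
 * Returns the number of mark removals performed.
 */
static size_t unmark_after_write(struct toy_folio **seen, size_t n)
{
	unsigned long unlocked = 0;
	bool have_unlocked = false;
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_folio *f = seen[i];

		assert(f->nr_pages > 0);
		if (have_unlocked && f->index <= unlocked)
			continue; /* already covered by an earlier folio */
		unlocked = f->index + f->nr_pages - 1;
		f->unmark_count++;
		have_unlocked = true;
		removed++;
	}
	return removed;
}
```

Feeding it a four-page folio at index 0, then a fragment at index 2, then a one-page folio at index 4 removes the mark from the first and third entries only.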
    • netfs: Fix interaction between write-streaming and cachefiles culling · 92a714d7
      David Howells authored
      An issue can occur between write-streaming (storing dirty data in partial
      non-uptodate pages) and a cachefiles object being culled to make space.
      The problem occurs because the cache object is only marked in use while
      there are files open using it.  Once it has been released, it can be culled
      and the cookie marked disabled.
      
      At this point, a streaming write is permitted to occur (if the cache is
      active, we require pages to be prefetched and cached), but the cache can
      become active again before this gets flushed out - and then two effects can
      occur:
      
       (1) The cache may be asked to write out a region that's less than its DIO
           block size (assumed by cachefiles to be PAGE_SIZE) - and this causes
           one of two debugging statements to be emitted.
      
       (2) netfs_how_to_modify() gets confused because it sees a page that isn't
           allowed to be non-uptodate being non-uptodate and tries to prefetch
           it - leading to a warning that PG_fscache is set twice.
      
      Fix this by the following means:
      
       (1) Add a netfs_inode flag to disallow write-streaming to an inode and set
           it if we ever do local caching of that inode.  It remains set for the
           lifetime of that inode - even if the cookie becomes disabled.
      
       (2) If the no-write-streaming flag is set, then make netfs_how_to_modify()
           always want to prefetch instead.
      
       (3) If netfs_how_to_modify() decides it wants to prefetch a folio, but
           that folio has write-streamed data in it, then it requires the folio
           be flushed first.
      
       (4) Export a counter of the number of times we wanted to prefetch a
           non-uptodate page, but found it had write-streamed data in it.
      
       (5) Export a counter of the number of times we cancelled a write to the
           cache because it didn't DIO align and remove the debug statements.
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-erofs@lists.ozlabs.org
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Count DIO writes · 4088e389
      David Howells authored
      Provide a counter for DIO writes to match that for DIO reads.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Mark netfs_unbuffered_write_iter_locked() static · 0e4d464c
      David Howells authored
      Mark netfs_unbuffered_write_iter_locked() static as it's only called from
      the file in which it is defined.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
  4. 04 Jan, 2024 5 commits
  5. 03 Jan, 2024 2 commits
  6. 28 Dec, 2023 26 commits
    • Merge tag 'netfs-lib-20231228' of... · 86fb5941
      Christian Brauner authored
      Merge tag 'netfs-lib-20231228' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      Pull netfs updates from David Howells:
      
      The main aims of these patches are to get high-level I/O and knowledge of
      the pagecache out of the filesystem drivers as much as possible and to get
      rid, as much as possible, of the knowledge that pages/folios exist.
      Further, I would like to see ->write_begin, ->write_end and
      ->launder_folio go away.
      
      Features that are added by these patches to that which is already there in
      netfslib:
      
       (1) NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to
           prevent these from happening at the same time.  mmap'd I/O can, of
           necessity, happen at any time ignoring these locks.
      
       (2) Support for unbuffered I/O.  The data is kept in the bounce buffer and
           the pagecache is not used.  This can be turned on with an inode flag.
      
       (3) Support for direct I/O.  This is basically unbuffered I/O with some
           extra restrictions and no RMW.
      
       (4) Support for using a bounce buffer in an operation.  The bounce buffer
           may be bigger than the target data/buffer, allowing for crypto
           rounding.
      
       (5) ->write_begin() and ->write_end() are ignored in favour of merging all
           of that into one function, netfs_perform_write(), thereby avoiding the
           function pointer traversals.
      
       (6) Support for write-through caching in the pagecache.
           netfs_perform_write() adds the pages it modifies to an I/O operation
           as it goes and directly marks them writeback rather than dirty.  When
           writing back from write-through, it limits the range written back.
           This should allow CIFS to deal with byte-range mandatory locks
           correctly.
      
       (7) O_*SYNC and RWF_*SYNC writes use write-through rather than writing to
           the pagecache and then flushing afterwards.  An AIO O_*SYNC write will
           notify of completion when the sub-writes all complete.
      
       (8) Support for write-streaming where modified data is held in !uptodate
           folios, with a private struct attached indicating the range that is
           valid.
      
       (9) Support for write grouping, multiplexing a pointer to a group in the
           folio private data with the write-streaming data.  The writepages
           algorithm only writes stuff back that's in the nominated group.  This
           is intended for use by Ceph to write its snaps in order.
      
      (10) Skipping reads for which we know the server could only supply zeros or
           EOF (for instance if we've done a local write that leaves a hole in
           the file and extends the local inode size).
      
      General notes:
      
       (1) The fscache module is merged into the netfslib module to avoid cyclic
           exported symbol usage that prevents either module from being loaded.
      
       (2) Some helpers from fscache are reassigned to netfslib by name.
      
       (3) netfslib now makes use of folio->private, which means the filesystem
           can't use it.
      
       (4) The filesystem provides wrappers to call the write helpers, allowing
           it to do pre-validation, oplock/capability fetching and the passing in
           of write group info.
      
       (5) I want to try flushing the data when tearing down an inode before
           invalidating it to try and render launder_folio unnecessary.
      
       (6) Write-through caching will generate and dispatch write subrequests as
           it gathers enough data to hit wsize and has whole pages that at least
           span that size.  This needs to be a bit more flexible, allowing for a
           filesystem such as CIFS to have a variable wsize.
      
       (7) The filesystem driver is just given read and write calls with an
           iov_iter describing the data/buffer to use.  Ideally, they don't see
           pages or folios at all.  A function, extract_iter_to_sg(), is already
           available to decant part of an iterator into a scatterlist for crypto
           purposes.
      
      AFS notes:
      
       (1) I pushed a pair of patches that clean up the trace header down to the
           base so that they can be shared with another branch.
      
      9P notes:
      
       (1) Most of xfstests now pass - more, in fact, since upstream 9p lacks a
           writepages method and can't handle mmap writes.  An occasional oops
           (and sometimes panic) happens somewhere in the pathwalk/FID handling
           code that is unrelated to these changes.
      
       (2) Writes should now occur in larger-than-page-sized chunks.
      
       (3) It should be possible to turn on multipage folio support in 9P now.
      
      All in all these patches remove a little over 800 lines from AFS, 300
      from 9P, albeit with around 3000 lines added to netfs. Hopefully, I will
      be able to remove a bunch of lines from Ceph too.
      
      I've split the CIFS patches out to a separate branch, cifs-netfs, where
      a further 2000+ lines are removed.  I can run a certain amount of
      xfstests on CIFS, though I'm running into ksmbd issues and not all the
      tests work correctly because of issues between fallocate and what the
      SMB protocol actually supports.
      
      I've also dropped the content-crypto patches out for the moment as
      they're only usable by the ceph changes which I'm still working on.
      
      The patch to use PG_writeback instead of PG_fscache for writing to the
      cache has also been deferred, pending 9p, afs, ceph and cifs all being
      converted.
      
      * tag 'netfs-lib-20231228' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (40 commits)
        9p: Use netfslib read/write_iter
        afs: Use the netfs write helpers
        netfs: Export the netfs_sreq tracepoint
        netfs: Optimise away reads above the point at which there can be no data
        netfs: Implement a write-through caching option
        netfs: Provide a launder_folio implementation
        netfs: Provide a writepages implementation
        netfs, cachefiles: Pass upper bound length to allow expansion
        netfs: Provide netfs_file_read_iter()
        netfs: Allow buffered shared-writeable mmap through netfs_page_mkwrite()
        netfs: Implement buffered write API
        netfs: Implement unbuffered/DIO write support
        netfs: Implement unbuffered/DIO read support
        netfs: Allocate multipage folios in the writepath
        netfs: Make netfs_read_folio() handle streaming-write pages
        netfs: Provide func to copy data to pagecache for buffered write
        netfs: Dispatch write requests to process a writeback slice
        netfs: Prep to use folio->private for write grouping and streaming write
        netfs: Make the refcounting of netfs_begin_read() easier to use
        netfs: Make netfs_put_request() handle a NULL pointer
        ...
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • 9p: Use netfslib read/write_iter · 80105ed2
      David Howells authored
      Use netfslib's read and write iteration helpers, allowing netfslib to take
      over the management of the page cache for 9p files and to manage local disk
      caching.  In particular, this eliminates write_begin, write_end, writepage
      and all mentions of struct page and struct folio from 9p.
      
      Note that netfslib now offers the possibility of write-through caching if
      that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in
      v9inode->netfs.flags in v9fs_set_netfs_context().
      
      Note also this is untested as I can't get ganesha.nfsd to correctly parse
      the config to turn on 9p support.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: Eric Van Hensbergen <ericvh@kernel.org>
      cc: Latchesar Ionkov <lucho@ionkov.net>
      cc: Dominique Martinet <asmadeus@codewreck.org>
      cc: Christian Schoenebeck <linux_oss@crudebyte.com>
      cc: v9fs@lists.linux.dev
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
    • afs: Use the netfs write helpers · 3560358a
      David Howells authored
      Make afs use the netfs write helpers.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Export the netfs_sreq tracepoint · 545b135b
      David Howells authored
      Export the netfs_sreq tracepoint so that it can be called directly from
      client filesystems/cache backend modules.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Optimise away reads above the point at which there can be no data · 100ccd18
      David Howells authored
      Track the file position above which the server is not expected to have any
      data (the "zero point") and preemptively assume that we can satisfy
      requests by filling them with zeroes locally rather than attempting to
      download them if they're over that line - even if we've written data back
      to the server.  Assume that any data that was written back above that
      position is held in the local cache.  Note that we have to split requests
      that straddle the line.
      
      Make use of this to optimise away some reads from the server.  We need to
      set the zero point in the following circumstances:
      
       (1) When we see an extant remote inode and have no cache for it, we set
           the zero_point to i_size.
      
       (2) On local inode creation, we set zero_point to 0.
      
       (3) On local truncation down, we reduce zero_point to the new i_size if
           the new i_size is lower.
      
       (4) On local truncation up, we don't change zero_point.
      
       (5) On local modification, we don't change zero_point.
      
       (6) On remote invalidation, we set zero_point to the new i_size.
      
       (7) If stored data is discarded from the pagecache or culled from fscache,
           we must set zero_point above that if the data also got written to the
           server.
      
       (8) If dirty data is written back to the server, but not fscache, we must
           set zero_point above that.
      
       (9) If a direct I/O write is made, set zero_point above that.
      
      Assuming the above, any read from the server at or above the zero_point
      position will return all zeroes.
      
      The zero_point value can be stored in the cache, provided the above rules
      are applied to it by any code that culls part of the local cache.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
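Some of the zero_point rules listed in this commit can be sketched in user-space C. This is an illustration under assumptions, not netfslib code: `struct toy_inode`, `struct toy_split` and the `toy_*` functions are hypothetical, only rules (3), (4), (6) and (9) are modelled, and the local cache is ignored (bytes at or above zero_point are simply reported as not needing a download).

```c
#include <assert.h>

typedef long long toy_pos; /* file position in bytes */

/* Hypothetical inode state: file size plus the tracked zero point. */
struct toy_inode {
	toy_pos i_size;
	toy_pos zero_point;
};

/* How a read divides: bytes to download and bytes satisfied locally. */
struct toy_split {
	toy_pos from_server;
	toy_pos local;
};

/* Rules (3) and (4): truncating down may lower zero_point; up never raises it. */
static void toy_truncate(struct toy_inode *ino, toy_pos new_size)
{
	if (new_size < ino->zero_point)
		ino->zero_point = new_size;
	ino->i_size = new_size;
}

/* Rule (6): remote invalidation resets zero_point to the new i_size. */
static void toy_remote_invalidate(struct toy_inode *ino, toy_pos new_size)
{
	ino->i_size = new_size;
	ino->zero_point = new_size;
}

/* Rule (9): a direct I/O write moves zero_point above the written range. */
static void toy_dio_write(struct toy_inode *ino, toy_pos pos, toy_pos len)
{
	toy_pos end = pos + len;

	assert(pos >= 0 && len >= 0);
	if (end > ino->zero_point)
		ino->zero_point = end;
	if (end > ino->i_size)
		ino->i_size = end;
}

/* Split a read that may straddle the zero point. */
static struct toy_split toy_split_read(const struct toy_inode *ino,
				       toy_pos start, toy_pos len)
{
	struct toy_split s = { 0, 0 };
	toy_pos end = start + len;

	if (start >= ino->zero_point) {
		s.local = len;
	} else if (end <= ino->zero_point) {
		s.from_server = len;
	} else {
		s.from_server = ino->zero_point - start;
		s.local = end - ino->zero_point;
	}
	return s;
}
```

With i_size 8192 and zero_point 4096, a read of the whole file would be split into 4096 bytes to download and 4096 bytes handled locally.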
    • netfs: Implement a write-through caching option · 41d8e767
      David Howells authored
      Provide a flag whereby a filesystem may request that netfs_perform_write()
      perform write-through caching.  This involves putting pages directly into
      writeback rather than dirty and attaching them to a write operation as we
      go.
      
      Further, the writes being made are limited to the byte range being written
      rather than whole folios being written.  This can be used by cifs, for
      example, to deal with strict byte-range locking.
      
      This can't be used with content encryption as that may require expansion of
      the write RPC beyond the write being made.
      
      This doesn't affect writes via mmap - those are written back in the normal
      way; similarly failed writethrough writes are marked dirty and left to
      writeback to retry.  Another option would be to simply invalidate them, but
      the contents can be simultaneously accessed by read() and through mmap.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Provide a launder_folio implementation · 4a79616c
      David Howells authored
      Provide a launder_folio implementation for netfslib.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Provide a writepages implementation · 62c3b748
      David Howells authored
      Provide an implementation of writepages for network filesystems to delegate
      to.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs, cachefiles: Pass upper bound length to allow expansion · e0ace6ca
      David Howells authored
      Make netfslib pass the maximum length to the ->prepare_write() op to tell
      the cache how much it can expand the length of a write to.  This allows a
      write to the server at the end of a file to be limited to a few bytes
      whilst writing an entire block to the cache (something required by direct
      I/O).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
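The role of the upper bound can be sketched as follows. This is a hypothetical user-space model, not the cachefiles implementation: `toy_prepare_write()` and its parameters are invented for illustration, and the policy of refusing a misaligned start is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of a cache deciding how far to expand a write.
 * The cache wants whole blocks; 'upper_len' is the most it may expand
 * the write to.  On success *len is rounded up to a block multiple.
 * On failure *len is left untouched.
 */
static bool toy_prepare_write(size_t start, size_t *len,
			      size_t upper_len, size_t block)
{
	size_t rounded;

	assert(block > 0);
	if (start % block != 0)
		return false; /* sketch assumption: no misaligned starts */

	rounded = (*len + block - 1) / block * block;
	if (rounded > upper_len)
		return false; /* not allowed to expand that far */

	*len = rounded;
	return true;
}
```

A 10-byte write at the end of a file could thus go to the server as 10 bytes while the cache, given an upper bound of one block, stores the whole 4096-byte block.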
    • netfs: Provide netfs_file_read_iter() · 80645bd4
      David Howells authored
      Provide a top-level-ish function that can be pointed to directly by
      ->read_iter file op.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Allow buffered shared-writeable mmap through netfs_page_mkwrite() · 102a7e2c
      David Howells authored
      Provide an entry point to delegate a filesystem's ->page_mkwrite() to.
      This checks for conflicting writes, then attaches any netfs-specific group
      marking (e.g. ceph snap) to the page to be considered dirty.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Implement buffered write API · 938e13a7
      David Howells authored
      Institute a netfs write helper, netfs_file_write_iter(), to be pointed at
      by the network filesystem ->write_iter() call.  Make it handle buffered
      writes by calling the previously defined netfs_perform_write() to copy the
      source data into the pagecache.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Implement unbuffered/DIO write support · 153a9961
      David Howells authored
      Implement support for unbuffered writes and direct I/O writes.  If the
      write is misaligned with respect to the fscrypt block size, then RMW cycles
      are performed if necessary.  DIO writes are a special case of unbuffered
      writes with extra restrictions imposed, such as block size alignment
      requirements.
      
      Also provide a field that can tell the code to add some extra space onto
      the bounce buffer for use by the filesystem in the case of a
      content-encrypted file.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Implement unbuffered/DIO read support · 016dc851
      David Howells authored
      Implement support for unbuffered and DIO reads in the netfs library,
      utilising the existing read helper code to do block splitting and
      individual queuing.  The code also handles extraction of the destination
      buffer from the supplied iterator, allowing async unbuffered reads to take
      place.
      
      The read will be split up according to the rsize setting and, if supplied,
      the ->clamp_length() method.  Note that the next subrequest will be issued
      as soon as issue_op returns, without waiting for previous ones to finish.
      The network filesystem needs to pause or handle queuing them if it doesn't
      want to fire them all at the server simultaneously.
      
      Once all the subrequests have finished, the state will be assessed and the
      amount of data to be indicated as having been obtained will be
      determined.  As the subrequests may finish in any order, if an intermediate
      subrequest is short, any further subrequests may be copied into the buffer
      and then abandoned.
      
      In the future, this will also take care of doing an unbuffered read from
      encrypted content, with the decryption being done by the library.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
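The assessment step, where a short intermediate subrequest cuts off everything after it, can be modelled in a few lines. This is a user-space sketch under assumptions: `toy_assess_read()` and its array arguments are hypothetical, and the real library works on subrequest structures rather than plain arrays.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given the length each subrequest asked for and the length it actually
 * transferred (in file order), work out how much contiguous data can be
 * reported to the caller.  Data from subrequests after the first short
 * one is not counted, even if it arrived.
 */
static size_t toy_assess_read(const size_t *wanted, const size_t *got,
			      size_t nr_subreqs)
{
	size_t total = 0;

	for (size_t i = 0; i < nr_subreqs; i++) {
		assert(got[i] <= wanted[i]);
		total += got[i];
		if (got[i] < wanted[i])
			break; /* short read: stop counting here */
	}
	return total;
}
```

For three 100-byte subrequests where the second returns only 40 bytes, the read would report 140 bytes regardless of what the third subrequest delivered.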
    • netfs: Allocate multipage folios in the writepath · e2e2e839
      David Howells authored
      Allocate a multipage folio when copying data into the pagecache, if
      possible and if there's sufficient data to warrant it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Make netfs_read_folio() handle streaming-write pages · 7f84a7b9
      David Howells authored
      netfs_read_folio() needs to handle partially-valid pages that are marked
      dirty, but not uptodate, in the event that someone tries to read a page
      that was used to cache data by a streaming write.
      
      In such a case, make netfs_read_folio() set up a bvec iterator that points
      to the parts of the folio that need filling and to a sink page for the data
      that should be discarded and use that instead of i_pages as the iterator to
      be written to.
      
      This requires netfs_rreq_unlock_folios() to convert the page into a normal
      dirty uptodate page, getting rid of the partial write record and bumping
      the group pointer over to folio->private.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Provide func to copy data to pagecache for buffered write · c38f4e96
      David Howells authored
      Provide a netfs write helper, netfs_perform_write() to buffer data to be
      written in the pagecache and mark the modified folios dirty.
      
      It will perform "streaming writes" for folios that aren't currently
      resident, if possible, storing data in partially modified folios that are
      marked dirty, but not uptodate.  It will also tag pages as belonging to
      fs-specific write groups if so directed by the filesystem.
      
      This is derived from generic_perform_write(), but doesn't use
      ->write_begin() and ->write_end(), having that logic rolled in instead.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Dispatch write requests to process a writeback slice · 0e0f2dfe
      David Howells authored
      Dispatch one or more write requests to process a writeback slice, where a
      slice is tailored more to logical block divisions within the file (such as
      crypto blocks, an object layout or cache granules) than the protocol RPC
      maximum capacity.
      
      The dispatch doesn't happen until throttling allows, at which point the
      entire writeback slice is processed and queued.  A slice may be written to
      multiple destinations (one or more servers and the local cache) and the
      writes to each destination might be split up along different lines.
      
      The writeback slice holds the required folios pinned.  An iov_iter is
      provided in netfs_write_request that describes the buffer to be used.  This
      may be part of the pagecache, may have auxiliary padding pages attached or
      may be a bounce buffer resulting from crypto or compression.  Consequently,
      the filesystem must not twiddle the folio markings directly.
      
      The following API is available to the filesystem:
      
       (1) The ->create_write_requests() method is called to ask the filesystem
           to create the requests it needs.  This is passed the writeback slice
           to be processed.
      
       (2) The filesystem should then call netfs_create_write_request() to create
           the requests it needs.
      
       (3) Once a request is initialised, netfs_queue_write_request() can be
           called to dispatch it asynchronously, if not completed immediately.
      
       (4) netfs_write_request_completed() should be called to note the
           completion of a request.
      
       (5) netfs_get_write_request() and netfs_put_write_request() are provided
           to refcount a request.  These take constants from the netfs_wreq_trace
           enum for logging into ftrace.
      
       (6) The ->free_write_request() method is called to ask the filesystem to
           clean up a request.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Prep to use folio->private for write grouping and streaming write · 9ebff83e
      David Howells authored
      Prepare to use folio->private to hold information for write grouping and
      streaming write.  These are implemented in the same commit as they both
      make use of folio->private and will both be checked at the same time in
      several places.
      
      "Write grouping" involves ordering the writeback of groups of writes, such
      as is needed for ceph snaps.  A group is represented by a
      filesystem-supplied object which must contain a netfs_group struct.  This
      contains just a refcount and a pointer to a destructor.
      
      "Streaming write" is the storage of data in folios that are marked dirty,
      but not uptodate, to avoid unnecessary reads of data.  This is represented
      by a netfs_folio struct.  This contains the offset and length of the
      modified region plus the otherwise displaced write grouping pointer.
      
      The way folio->private is multiplexed is:
      
       (1) If private is NULL then neither is in operation on a dirty folio.
      
       (2) If private is set, with bit 0 clear, then this points to a group.
      
       (3) If private is set, with bit 0 set, then this points to a netfs_folio
           struct (with bit 0 AND'ed out).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
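The three-way multiplexing of folio->private described in this commit is a form of pointer tagging, which can be sketched in user-space C. The names here (`struct toy_group`, `struct toy_folio_info`, `TOY_FOLIO_INFO` and the `toy_*` helpers) are hypothetical stand-ins, not the kernel's definitions; the sketch assumes both structs are at least 2-byte aligned so that bit 0 of their addresses is free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical write group: just a refcount for this sketch. */
struct toy_group {
	int refcount;
};

/* Hypothetical streaming-write record: dirty range plus displaced group. */
struct toy_folio_info {
	struct toy_group *group;
	size_t dirty_offset;
	size_t dirty_len;
};

#define TOY_FOLIO_INFO ((uintptr_t)1) /* bit 0 set: points to toy_folio_info */

/* Case (2): a bare group pointer is stored as-is, with bit 0 clear. */
static void *toy_encode_group(struct toy_group *group)
{
	assert(((uintptr_t)group & TOY_FOLIO_INFO) == 0);
	return group;
}

/* Case (3): tag the record pointer by setting bit 0. */
static void *toy_encode_info(struct toy_folio_info *info)
{
	assert(((uintptr_t)info & TOY_FOLIO_INFO) == 0);
	return (void *)((uintptr_t)info | TOY_FOLIO_INFO);
}

/* Return the streaming-write record, or NULL if private holds none. */
static struct toy_folio_info *toy_decode_info(void *priv)
{
	if (((uintptr_t)priv & TOY_FOLIO_INFO) == 0)
		return NULL;
	return (struct toy_folio_info *)((uintptr_t)priv & ~TOY_FOLIO_INFO);
}

/* Return the write group, whether stored directly or inside the record. */
static struct toy_group *toy_decode_group(void *priv)
{
	struct toy_folio_info *info = toy_decode_info(priv);

	if (info)
		return info->group;
	return priv; /* NULL (case 1) or a bare group pointer (case 2) */
}
```

Checking bit 0 first lets one accessor serve all three cases, which is why the displaced group pointer is carried inside the streaming-write record.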
    • netfs: Make the refcounting of netfs_begin_read() easier to use · 4fcccc38
      David Howells authored
      Make the refcounting of netfs_begin_read() easier to use by not eating the
      caller's ref on the netfs_io_request it's given.  This makes it easier to
      use when we need to look in the request struct after.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Make netfs_put_request() handle a NULL pointer · 6ba22d8d
      David Howells authored
      Make netfs_put_request() just return if given a NULL request pointer.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Add a hook to allow tell the netfs to update its i_size · c6dc54dd
      David Howells authored
      Add a hook for netfslib's write helpers to call to tell the network
      filesystem that it should update its i_size.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Extend the netfs_io_*request structs to handle writes · 16af134c
      David Howells authored
      Modify the netfs_io_request struct to act as a point around which writes
      can be coordinated.  It represents and pins a range of pages that need
      writing and a list of regions of dirty data in that range of pages.
      
      If RMW is required, the original data can be downloaded into the bounce
      buffer, decrypted if necessary, the modifications made, then the modified
      data can be reencrypted/recompressed and sent back to the server.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Limit subrequest by size or number of segments · 768ddb1e
      David Howells authored
      Limit a subrequest to a maximum size and/or a maximum number of contiguous
      physical regions.  This permits, for instance, a subreq's iterator to be
      limited to the number of DMA'able segments that a large RDMA request can
      handle.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
    • netfs: Add func to calculate pagecount/size-limited span of an iterator · cae932d3
      David Howells authored
      Add a function to work out how much of an ITER_BVEC or ITER_XARRAY iterator
      we can use in a pagecount-limited and size-limited span.  This will be
      used, for example, to limit the number of segments in a subrequest to the
      maximum number of elements that an RDMA transfer can handle.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org
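The kind of calculation this commit describes can be sketched over a plain array of segment lengths. This is a simplified user-space model, not the kernel function: `toy_limit_span()` is a hypothetical name, the segments stand in for the elements of an ITER_BVEC iterator, and starting offsets within the iterator are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Work out how many bytes can be taken from the front of a segment list
 * without exceeding 'max_size' bytes or 'max_segs' segments.  The last
 * segment used may be taken only partially if the size limit is hit.
 */
static size_t toy_limit_span(const size_t *seg_len, size_t nr_segs,
			     size_t max_size, size_t max_segs)
{
	size_t span = 0;
	size_t used = 0;

	for (size_t i = 0; i < nr_segs; i++) {
		size_t take = seg_len[i];

		if (used >= max_segs || span >= max_size)
			break; /* one of the two limits has been reached */
		if (take > max_size - span)
			take = max_size - span;
		span += take;
		used++;
	}
	assert(span <= max_size);
	return span;
}
```

With three 4096-byte segments, a limit of two segments yields 8192 bytes, while a 10000-byte size limit yields 10000 bytes spread over all three.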
    • netfs: Provide tools to create a buffer in an xarray · 7d828a06
      David Howells authored
      Provide tools to create a buffer in an xarray, with a function to add new
      folios with a mark.  This will be used to create a bounce buffer and can be
      used more easily to create a list of folios the span of which would require
      more than a page's worth of bio_vec structs.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      cc: linux-cachefs@redhat.com
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-mm@kvack.org