- 11 Jan, 2024 1 commit
-
-
Christian Brauner authored
Merge tag 'netfs-lib-20240109' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into vfs.netfs Pull netfs updates from David Howells: A few follow-up fixes for the netfs work for this cycle. * tag 'netfs-lib-20240109' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: netfs: Fix wrong #ifdef hiding wait cachefiles: Fix signed/unsigned mixup netfs: Fix the loop that unmarks folios after writing to the cache netfs: Fix interaction between write-streaming and cachefiles culling netfs: Count DIO writes netfs: Mark netfs_unbuffered_write_iter_locked() static Tested-by: Marc Dionne <marc.dionne@auristor.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
- 09 Jan, 2024 2 commits
-
-
David Howells authored
netfs_writepages_begin() has the wait on the fscache folio conditional on CONFIG_NETFS_FSCACHE - which doesn't exist. Fix it to be conditional on CONFIG_FSCACHE instead. Fixes: 62c3b748 ("netfs: Provide a writepages implementation") Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20240109083257.GK132648@kernel.org/
-
David Howells authored
In __cachefiles_prepare_write(), the start and pos variables were made unsigned 64-bit so that the casts in the checking could be got rid of - which should be fine since absolute file offsets can't be negative, except that an error code may be obtained from vfs_llseek(), which *would* be negative. This breaks the error check. Fix this for now by reverting pos and start to be signed and putting back the casts. Unfortunately, the error value checks cannot be replaced with IS_ERR_VALUE() as long might be 32-bits. Fixes: 7097c964 ("cachefiles: Fix __cachefiles_prepare_write()") Reported-by: Simon Horman <horms@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401071152.DbKqMQMu-lkp@intel.com/Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> cc: Yiqun Leng <yqleng@linux.alibaba.com> cc: Jia Zhu <zhujia.zj@bytedance.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-erofs@lists.ozlabs.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
- 05 Jan, 2024 4 commits
-
-
David Howells authored
In the loop in netfs_rreq_unmark_after_write() that removes the PG_fscache from folios after they've been written to the cache, as soon as we remove the mark from a multipage folio, it can get split - and then we might see a fragment of folio again. Guard against this by advancing the 'unlocked' tracker to the index of the last page in the folio to avoid a double removal of the PG_fscache mark. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
An issue can occur between write-streaming (storing dirty data in partial non-uptodate pages) and a cachefiles object being culled to make space. The problem occurs because the cache object is only marked in use while there are files open using it. Once it has been released, it can be culled and the cookie marked disabled. At this point, a streaming write is permitted to occur (if the cache is active, we require pages to be prefetched and cached), but the cache can become active again before this gets flushed out - and then two effects can occur: (1) The cache may be asked to write out a region that's less than its DIO block size (assumed by cachefiles to be PAGE_SIZE) - and this causes one of two debugging statements to be emitted. (2) netfs_how_to_modify() gets confused because it sees a page that isn't allowed to be non-uptodate being uptodate and tries to prefetch it - leading to a warning that PG_fscache is set twice. Fix this by the following means: (1) Add a netfs_inode flag to disallow write-streaming to an inode and set it if we ever do local caching of that inode. It remains set for the lifetime of that inode - even if the cookie becomes disabled. (2) If the no-write-streaming flag is set, then make netfs_how_to_modify() always want to prefetch instead. (3) If netfs_how_to_modify() decides it wants to prefetch a folio, but that folio has write-streamed data in it, then it requires the folio be flushed first. (4) Export a counter of the number of times we wanted to prefetch a non-uptodate page, but found it had write-streamed data in it. (5) Export a counter of the number of times we cancelled a write to the cache because it didn't DIO align and remove the debug statements. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-erofs@lists.ozlabs.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide a counter for DIO writes to match that for DIO reads. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Mark netfs_unbuffered_write_iter_locked() static as it's only called from the file in which it is defined. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
- 04 Jan, 2024 5 commits
-
-
Christian Brauner authored
Merge tag 'netfs-lib-20240104' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfs updates from David Howells: A few follow-up fixes for the netfs work for this cycle. * tag 'netfs-lib-20240104' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" netfs: Rearrange netfs_io_subrequest to put request pointer first 9p: Use length of data written to the server in preference to error 9p: Do a couple of cleanups 9p: Fix initialisation of netfs_inode for 9p cachefiles: Fix __cachefiles_prepare_write() Signed-off-by: Christian Brauner <brauner@kernel.org>
-
David Howells authored
Fix the proc/fs/fscache symlink to point to "netfs" not "../netfs". Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Christian Brauner <christian@brauner.io> cc: linux-fsdevel@vger.kernel.org cc: linux-cachefs@redhat.com
-
David Howells authored
Rearrange the netfs_io_subrequest struct to put the netfs_io_request pointer (rreq) first. This then allows netfs_io_subrequest to be put in a union with a pointer to a wrapper around netfs_io_request. This will be useful in the future for cifs and maybe ceph. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
In v9fs_upload_to_server(), we pass the error to netfslib to terminate the subreq rather than the amount of data written - even if we did actually write something. Further, we assume that the write is always entirely done if successful - but it might have been partially complete - as returned by p9_client_write(), but we ignore that. Fix this by indicating the amount written by preference and only returning the error if we didn't write anything. (We might want to return both in future if both are available as this might be useful as to whether we retry or not.) Suggested-by: Dominique Martinet <asmadeus@codewreck.org> Link: https://lore.kernel.org/r/ZZULNQAZ0n0WQv7p@codewreck.org/Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org
-
David Howells authored
Do a couple of cleanups to 9p: (1) Remove a couple of unused variables. (2) Turn a BUG_ON() into a warning, consolidate with another warning and make the warning message include the inode number rather than whatever's in i_private (which will get hashed anyway). Suggested-by: Dominique Martinet <asmadeus@codewreck.org> Link: https://lore.kernel.org/r/ZZULNQAZ0n0WQv7p@codewreck.org/Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org
-
- 03 Jan, 2024 2 commits
-
-
David Howells authored
The 9p filesystem is calling netfs_inode_init() in v9fs_init_inode() - before the struct inode fields have been initialised from the obtained file stats (ie. after v9fs_stat2inode*() has been called), but netfslib wants to set a couple of its fields from i_size. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org
-
David Howells authored
Fix __cachefiles_prepare_write() to correctly determine whether the requested write will fit correctly with the DIO alignment. Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Yiqun Leng <yqleng@linux.alibaba.com> Tested-by: Jia Zhu <zhujia.zj@bytedance.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-erofs@lists.ozlabs.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
- 28 Dec, 2023 26 commits
-
-
Christian Brauner authored
Merge tag 'netfs-lib-20231228' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfs updates from David Howells: The main aims of these patches are to get high-level I/O and knowledge of the pagecache out of the filesystem drivers as much as possible and to get rid, as much of possible, of the knowledge that pages/folios exist. Further, I would like to see ->write_begin, ->write_end and ->launder_folio go away. Features that are added by these patches to that which is already there in netfslib: (1) NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time. mmap'd I/O can, of necessity, happen at any time ignoring these locks. (2) Support for unbuffered I/O. The data is kept in the bounce buffer and the pagecache is not used. This can be turned on with an inode flag. (3) Support for direct I/O. This is basically unbuffered I/O with some extra restrictions and no RMW. (4) Support for using a bounce buffer in an operation. The bounce buffer may be bigger than the target data/buffer, allowing for crypto rounding. (5) ->write_begin() and ->write_end() are ignored in favour of merging all of that into one function, netfs_perform_write(), thereby avoiding the function pointer traversals. (6) Support for write-through caching in the pagecache. netfs_perform_write() adds the pages is modifies to an I/O operation as it goes and directly marks them writeback rather than dirty. When writing back from write-through, it limits the range written back. This should allow CIFS to deal with byte-range mandatory locks correctly. (7) O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the pagecache and then flushing afterwards. An AIO O_*SYNC write will notify of completion when the sub-writes all complete. (8) Support for write-streaming where modifed data is held in !uptodate folios, with a private struct attached indicating the range that is valid. (9) Support for write grouping, multiplexing a pointer to a group in the folio private data with the write-streaming data. The writepages algorithm only writes stuff back that's in the nominated group. This is intended for use by Ceph to write is snaps in order. (10) Skipping reads for which we know the server could only supply zeros or EOF (for instance if we've done a local write that leaves a hole in the file and extends the local inode size). General notes: (1) The fscache module is merged into the netfslib module to avoid cyclic exported symbol usage that prevents either module from being loaded. (2) Some helpers from fscache are reassigned to netfslib by name. (3) netfslib now makes use of folio->private, which means the filesystem can't use it. (4) The filesystem provides wrappers to call the write helpers, allowing it to do pre-validation, oplock/capability fetching and the passing in of write group info. (5) I want to try flushing the data when tearing down an inode before invalidating it to try and render launder_folio unnecessary. (6) Write-through caching will generate and dispatch write subrequests as it gathers enough data to hit wsize and has whole pages that at least span that size. This needs to be a bit more flexible, allowing for a filesystem such as CIFS to have a variable wsize. (7) The filesystem driver is just given read and write calls with an iov_iter describing the data/buffer to use. Ideally, they don't see pages or folios at all. A function, extract_iter_to_sg(), is already available to decant part of an iterator into a scatterlist for crypto purposes. AFS notes: (1) I pushed a pair of patches that clean up the trace header down to the base so that they can be shared with another branch. 9P notes: (1) Most of xfstests now pass - more, in fact, since upstream 9p lacks a writepages method and can't handle mmap writes. An occasional oops (and sometimes panic) happens somewhere in the pathwalk/FID handling code that is unrelated to these changes. (2) Writes should now occur in larger-than-page-sized chunks. (3) It should be possible to turn on multipage folio support in 9P now. All in all these patches remove a little over 800 lines from AFS, 300 from 9P, albeit with around 3000 lines added to netfs. Hopefully, I will be able to remove a bunch of lines from Ceph too. I've split the CIFS patches out to a separate branch, cifs-netfs, where a further 2000+ lines are removed. I can run a certain amount of xfstests on CIFS, though I'm running into ksmbd issues and not all the tests work correctly because of issues between fallocate and what the SMB protocol actually supports. I've also dropped the content-crypto patches out for the moment as they're only usable by the ceph changes which I'm still working on. The patch to use PG_writeback instead of PG_fscache for writing to the cache has also been deferred, pending 9p, afs, ceph and cifs all being converted. * tag 'netfs-lib-20231228' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (40 commits) 9p: Use netfslib read/write_iter afs: Use the netfs write helpers netfs: Export the netfs_sreq tracepoint netfs: Optimise away reads above the point at which there can be no data netfs: Implement a write-through caching option netfs: Provide a launder_folio implementation netfs: Provide a writepages implementation netfs, cachefiles: Pass upper bound length to allow expansion netfs: Provide netfs_file_read_iter() netfs: Allow buffered shared-writeable mmap through netfs_page_mkwrite() netfs: Implement buffered write API netfs: Implement unbuffered/DIO write support netfs: Implement unbuffered/DIO read support netfs: Allocate multipage folios in the writepath netfs: Make netfs_read_folio() handle streaming-write pages netfs: Provide func to copy data to pagecache for buffered write netfs: Dispatch write requests to process a writeback slice netfs: Prep to use folio->private for write grouping and streaming write netfs: Make the refcounting of netfs_begin_read() easier to use netfs: Make netfs_put_request() handle a NULL pointer ... Signed-off-by: Christian Brauner <brauner@kernel.org>
-
David Howells authored
Use netfslib's read and write iteration helpers, allowing netfslib to take over the management of the page cache for 9p files and to manage local disk caching. In particular, this eliminates write_begin, write_end, writepage and all mentions of struct page and struct folio from 9p. Note that netfslib now offers the possibility of write-through caching if that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in v9inode->netfs.flags in v9fs_set_netfs_context(). Note also this is untested as I can't get ganesha.nfsd to correctly parse the config to turn on 9p support. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org
-
David Howells authored
Make afs use the netfs write helpers. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Export the netfs_sreq tracepoint so that it can be called directly from client filesystems/cache backend modules. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Track the file position above which the server is not expected to have any data (the "zero point") and preemptively assume that we can satisfy requests by filling them with zeroes locally rather than attempting to download them if they're over that line - even if we've written data back to the server. Assume that any data that was written back above that position is held in the local cache. Note that we have to split requests that straddle the line. Make use of this to optimise away some reads from the server. We need to set the zero point in the following circumstances: (1) When we see an extant remote inode and have no cache for it, we set the zero_point to i_size. (2) On local inode creation, we set zero_point to 0. (3) On local truncation down, we reduce zero_point to the new i_size if the new i_size is lower. (4) On local truncation up, we don't change zero_point. (5) On local modification, we don't change zero_point. (6) On remote invalidation, we set zero_point to the new i_size. (7) If stored data is discarded from the pagecache or culled from fscache, we must set zero_point above that if the data also got written to the server. (8) If dirty data is written back to the server, but not fscache, we must set zero_point above that. (9) If a direct I/O write is made, set zero_point above that. Assuming the above, any read from the server at or above the zero_point position will return all zeroes. The zero_point value can be stored in the cache, provided the above rules are applied to it by any code that culls part of the local cache. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide a flag whereby a filesystem may request that cifs_perform_write() perform write-through caching. This involves putting pages directly into writeback rather than dirty and attaching them to a write operation as we go. Further, the writes being made are limited to the byte range being written rather than whole folios being written. This can be used by cifs, for example, to deal with strict byte-range locking. This can't be used with content encryption as that may require expansion of the write RPC beyond the write being made. This doesn't affect writes via mmap - those are written back in the normal way; similarly failed writethrough writes are marked dirty and left to writeback to retry. Another option would be to simply invalidate them, but the contents can be simultaneously accessed by read() and through mmap. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide a launder_folio implementation for netfslib. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide an implementation of writepages for network filesystems to delegate to. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Make netfslib pass the maximum length to the ->prepare_write() op to tell the cache how much it can expand the length of a write to. This allows a write to the server at the end of a file to be limited to a few bytes whilst writing an entire block to the cache (something required by direct I/O). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide a top-level-ish function that can be pointed to directly by ->read_iter file op. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide an entry point to delegate a filesystem's ->page_mkwrite() to. This checks for conflicting writes, then attached any netfs-specific group marking (e.g. ceph snap) to the page to be considered dirty. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Institute a netfs write helper, netfs_file_write_iter(), to be pointed at by the network filesystem ->write_iter() call. Make it handled buffered writes by calling the previously defined netfs_perform_write() to copy the source data into the pagecache. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Implement support for unbuffered writes and direct I/O writes. If the write is misaligned with respect to the fscrypt block size, then RMW cycles are performed if necessary. DIO writes are a special case of unbuffered writes with extra restriction imposed, such as block size alignment requirements. Also provide a field that can tell the code to add some extra space onto the bounce buffer for use by the filesystem in the case of a content-encrypted file. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Implement support for unbuffered and DIO reads in the netfs library, utilising the existing read helper code to do block splitting and individual queuing. The code also handles extraction of the destination buffer from the supplied iterator, allowing async unbuffered reads to take place. The read will be split up according to the rsize setting and, if supplied, the ->clamp_length() method. Note that the next subrequest will be issued as soon as issue_op returns, without waiting for previous ones to finish. The network filesystem needs to pause or handle queuing them if it doesn't want to fire them all at the server simultaneously. Once all the subrequests have finished, the state will be assessed and the amount of data to be indicated as having being obtained will be determined. As the subrequests may finish in any order, if an intermediate subrequest is short, any further subrequests may be copied into the buffer and then abandoned. In the future, this will also take care of doing an unbuffered read from encrypted content, with the decryption being done by the library. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Allocate a multipage folio when copying data into the pagecache if possible if there's sufficient data to warrant it. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
netfs_read_folio() needs to handle partially-valid pages that are marked dirty, but not uptodate in the event that someone tries to read a page was used to cache data by a streaming write. In such a case, make netfs_read_folio() set up a bvec iterator that points to the parts of the folio that need filling and to a sink page for the data that should be discarded and use that instead of i_pages as the iterator to be written to. This requires netfs_rreq_unlock_folios() to convert the page into a normal dirty uptodate page, getting rid of the partial write record and bumping the group pointer over to folio->private. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide a netfs write helper, netfs_perform_write() to buffer data to be written in the pagecache and mark the modified folios dirty. It will perform "streaming writes" for folios that aren't currently resident, if possible, storing data in partially modified folios that are marked dirty, but not uptodate. It will also tag pages as belonging to fs-specific write groups if so directed by the filesystem. This is derived from generic_perform_write(), but doesn't use ->write_begin() and ->write_end(), having that logic rolled in instead. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Dispatch one or more write reqeusts to process a writeback slice, where a slice is tailored more to logical block divisions within the file (such as crypto blocks, an object layout or cache granules) than the protocol RPC maximum capacity. The dispatch doesn't happen until throttling allows, at which point the entire writeback slice is processed and queued. A slice may be written to multiple destinations (one or more servers and the local cache) and the writes to each destination might be split up along different lines. The writeback slice holds the required folios pinned. An iov_iter is provided in netfs_write_request that describes the buffer to be used. This may be part of the pagecache, may have auxiliary padding pages attached or may be a bounce buffer resulting from crypto or compression. Consequently, the filesystem must not twiddle the folio markings directly. The following API is available to the filesystem: (1) The ->create_write_requests() method is called to ask the filesystem to create the requests it needs. This is passed the writeback slice to be processed. (2) The filesystem should then call netfs_create_write_request() to create the requests it needs. (3) Once a request is initialised, netfs_queue_write_request() can be called to dispatch it asynchronously, if not completed immediately. (4) netfs_write_request_completed() should be called to note the completion of a request. (5) netfs_get_write_request() and netfs_put_write_request() are provided to refcount a request. These take constants from the netfs_wreq_trace enum for logging into ftrace. (6) The ->free_write_request is method is called to ask the filesystem to clean up a request. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Prepare to use folio->private to hold information write grouping and streaming write. These are implemented in the same commit as they both make use of folio->private and will be both checked at the same time in several places. "Write grouping" involves ordering the writeback of groups of writes, such as is needed for ceph snaps. A group is represented by a filesystem-supplied object which must contain a netfs_group struct. This contains just a refcount and a pointer to a destructor. "Streaming write" is the storage of data in folios that are marked dirty, but not uptodate, to avoid unnecessary reads of data. This is represented by a netfs_folio struct. This contains the offset and length of the modified region plus the otherwise displaced write grouping pointer. The way folio->private is multiplexed is: (1) If private is NULL then neither is in operation on a dirty folio. (2) If private is set, with bit 0 clear, then this points to a group. (3) If private is set, with bit 0 set, then this points to a netfs_folio struct (with bit 0 AND'ed out). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Make the refcounting of netfs_begin_read() easier to use by not eating the caller's ref on the netfs_io_request it's given. This makes it easier to use when we need to look in the request struct after. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Make netfs_put_request() just return if given a NULL request pointer. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Add a hook for netfslib's write helpers to call to tell the network filesystem that it should update its i_size. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Modify the netfs_io_request struct to act as a point around which writes can be coordinated. It represents and pins a range of pages that need writing and a list of regions of dirty data in that range of pages. If RMW is required, the original data can be downloaded into the bounce buffer, decrypted if necessary, the modifications made, then the modified data can be reencrypted/recompressed and sent back to the server. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Limit a subrequest to a maximum size and/or a maximum number of contiguous physical regions. This permits, for instance, an subreq's iterator to be limited to the number of DMA'able segments that a large RDMA request can handle. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Add a function to work out how much of an ITER_BVEC or ITER_XARRAY iterator we can use in a pagecount-limited and size-limited span. This will be used, for example, to limit the number of segments in a subrequest to the maximum number of elements that an RDMA transfer can handle. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-
David Howells authored
Provide tools to create a buffer in an xarray, with a function to add new folios with a mark. This will be used to create bounce buffer and can be used more easily to create a list of folios the span of which would require more than a page's worth of bio_vec structs. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
-