- 08 Jun, 2021 5 commits
-
-
Darrick J. Wong authored
Merge tag 'assorted-fixes-5.14-1_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2 xfs: assorted fixes for 5.14, part 1 This branch contains the first round of various small fixes for 5.14. * tag 'assorted-fixes-5.14-1_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: don't take a spinlock unconditionally in the DIO fastpath xfs: mark xfs_bmap_set_attrforkoff static xfs: Remove redundant assignment to busy xfs: sort variable alphabetically to avoid repeated declaration
-
Darrick J. Wong authored
Merge tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2 xfs: various unit conversions Crafting the realtime file extent size hint fixes revealed various opportunities to clean up unit conversions, so now that gets its own series. * tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: remove unnecessary shifts xfs: clean up open-coded fs block unit conversions
-
Dave Chinner authored
From: Dave Chinner <dchinner@redhat.com> Stephen Rothwell reported this compiler warning from linux-next: fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt': fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable] 2032 | struct xfs_agi *agi = agbp->b_addr; Which is fallout from agno -> perag conversions that were done in this function. xfs_check_agi_freecount() is the only user of "agi" in xfs_difree_finobt() now, and it only uses the agi to get the current free inode count. We hold that in the perag structure, so there's not need to directly reference the raw AGI to get this information. The btree cursor being passed to xfs_check_agi_freecount() has a reference to the perag being operated on, so use that directly in xfs_check_agi_freecount() rather than passing an AGI. Fixes: 7b13c515 ("xfs: use perag for ialloc btree cursors") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Darrick J. Wong authored
Merge tag 'xfs-perag-conv-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2 xfs: initial agnumber -> perag conversions for shrink If we want to use active references to the perag to be able to gate shrink removing AGs and hence perags safely, we've got a fair bit of work to do actually use perags in all the places we need to. There's a lot of code that iterates ag numbers and then looks up perags from that, often multiple times for the same perag in the one operation. If we want to use reference counted perags for access control, then we need to convert all these uses to perag iterators, not agno iterators. [Patches 1-4] The first step of this is consolidating all the perag management - init, free, get, put, etc into a common location. THis is spread all over the place right now, so move it all into libxfs/xfs_ag.[ch]. This does expose kernel only bits of the perag to libxfs and hence userspace, so the structures and code is rearranged to minimise the number of ifdefs that need to be added to the userspace codebase. The perag iterator in xfs_icache.c is promoted to a first class API and expanded to the needs of the code as required. [Patches 5-10] These are the first basic perag iterator conversions and changes to pass the perag down the stack from those iterators where appropriate. A lot of this is obvious, simple changes, though in some places we stop passing the perag down the stack because the code enters into an as yet unconverted subsystem that still uses raw AGs. [Patches 11-16] These replace the agno passed in the btree cursor for per-ag btree operations with a perag that is passed to the cursor init function. The cursor takes it's own reference to the perag, and the reference is dropped when the cursor is deleted. Hence we get reference coverage for the entire time the cursor is active, even if the code that initialised the cursor drops it's reference before the cursor or any of it's children (duplicates) have been deleted. The first patch adds the perag infrastructure for the cursor, the next four patches convert a btree cursor at a time, and the last removes the agno from the cursor once it is unused. [Patches 17-21] These patches are a demonstration of the simplifications and cleanups that come from plumbing the perag through interfaces that select and then operate on a specific AG. In this case the inode allocation algorithm does up to three walks across all AGs before it either allocates an inode or fails. Two of these walks are purely just to select the AG, and even then it doesn't guarantee inode allocation success so there's a third walk if the selected AG allocation fails. These patches collapse the selection and allocation into a single loop, simplifies the error handling because xfs_dir_ialloc() always returns ENOSPC if no AG was selected for inode allocation or we fail to allocate an inode in any AG, gets rid of xfs_dir_ialloc() wrapper, converts inode allocation to run entirely from a single perag instance, and then factors xfs_dialloc() into a much, much simpler loop which is easy to understand. Hence we end up with the same inode allocation logic, but it only needs two complete iterations at worst, makes AG selection and allocation atomic w.r.t. shrink and chops out out over 100 lines of code from this hot code path. [Patch 22] Converts the unlink path to pass perags through it. There's more conversion work to be done, but this patchset gets through a large chunk of it in one hit. Most of the iterators are converted, so once this is solidified we can move on to converting these to active references for being able to free perags while the fs is still active. * tag 'xfs-perag-conv-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (23 commits) xfs: remove xfs_perag_t xfs: use perag through unlink processing xfs: clean up and simplify xfs_dialloc() xfs: inode allocation can use a single perag instance xfs: get rid of xfs_dir_ialloc() xfs: collapse AG selection for inode allocation xfs: simplify xfs_dialloc_select_ag() return values xfs: remove agno from btree cursor xfs: use perag for ialloc btree cursors xfs: convert allocbt cursors to use perags xfs: convert refcount btree cursor to use perags xfs: convert rmap btree cursor to using a perag xfs: add a perag to the btree cursor xfs: pass perags around in fsmap data dev functions xfs: push perags through the ag reservation callouts xfs: pass perags through to the busy extent code xfs: convert secondary superblock walk to use perags xfs: convert xfs_iwalk to use perag references xfs: convert raw ag walks to use for_each_perag xfs: make for_each_perag... a first class citizen ...
-
Darrick J. Wong authored
Merge tag 'xfs-buf-bulk-alloc-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2 xfs: buffer cache bulk page allocation This patchset makes use of the new bulk page allocation interface to reduce the overhead of allocating large numbers of pages in a loop. The first two patches are refactoring buffer memory allocation and converting the uncached buffer path to use the same page allocation path, followed by converting the page allocation path to use bulk allocation. The rest of the patches are then consolidation of the page allocation and freeing code to simplify the code and remove a chunk of unnecessary abstraction. This is largely based on a series of changes made by Christoph Hellwig. * tag 'xfs-buf-bulk-alloc-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: xfs: merge xfs_buf_allocate_memory xfs: cleanup error handling in xfs_buf_get_map xfs: get rid of xb_to_gfp() xfs: simplify the b_page_count calculation xfs: remove ->b_offset handling for page backed buffers xfs: move page freeing into _xfs_buf_free_pages() xfs: merge _xfs_buf_get_pages() xfs: use alloc_pages_bulk_array() for buffers xfs: use xfs_buf_alloc_pages for uncached buffers xfs: split up xfs_buf_allocate_memory
-
- 07 Jun, 2021 5 commits
-
-
Dave Chinner authored
It only has one caller and is now a simple function, so merge it into the caller. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
Use a single goto label for freeing the buffer and returning an error. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com>
-
Dave Chinner authored
Only used in one place, so just open code the logic in the macro. Based on a patch from Christoph Hellwig. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
Ever since we stopped using the Linux page cache to back XFS buffers there is no need to take the start sector into account for calculating the number of pages in a buffer, as the data always start from the beginning of the buffer. Signed-off-by: Christoph Hellwig <hch@lst.de> [dgc: modified to suit this series] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
->b_offset can only be non-zero for _XBF_KMEM backed buffers, so remove all code dealing with it for page backed buffers. Signed-off-by: Christoph Hellwig <hch@lst.de> [dgc: modified to fit this patchset] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
- 02 Jun, 2021 27 commits
-
-
Dave Chinner authored
Because this happens at high thread counts on high IOPS devices doing mixed read/write AIO-DIO to a single file at about a million iops: 64.09% 0.21% [kernel] [k] io_submit_one - 63.87% io_submit_one - 44.33% aio_write - 42.70% xfs_file_write_iter - 41.32% xfs_file_dio_write_aligned - 25.51% xfs_file_write_checks - 21.60% _raw_spin_lock - 21.59% do_raw_spin_lock - 19.70% __pv_queued_spin_lock_slowpath This also happens of the IO completion IO path: 22.89% 0.69% [kernel] [k] xfs_dio_write_end_io - 22.49% xfs_dio_write_end_io - 21.79% _raw_spin_lock - 20.97% do_raw_spin_lock - 20.10% __pv_queued_spin_lock_slowpath IOWs, fio is burning ~14 whole CPUs on this spin lock. So, do an unlocked check against inode size first, then if we are at/beyond EOF, take the spinlock and recheck. This makes the spinlock disappear from the overwrite fastpath. I'd like to report that fixing this makes things go faster. It doesn't - it just exposes the the XFS_ILOCK as the next severe contention point doing extent mapping lookups, and that now burns all the 14 CPUs this spinlock was burning. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it static. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Jiapeng Chong authored
Variable busy is set to false, but this value is never read as it is overwritten or not used later on, hence it is a redundant assignment and can be removed. Clean up the following clang-analyzer warning: fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Shaokun Zhang authored
Variable 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and 'xfs_symlink_buf_ops' are declared twice, so sort these variables alphabetically and remove the repeated declaration. Cc: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Almost unused, gets rid of another typedef. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Unlinked lists are held in the perag, and freeing of inodes needs to be passed a perag, too, so look up the perag early in the unlink processing and use it throughout. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
-
Dave Chinner authored
Because it's a mess. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Now that we've internalised the two-phase inode allocation, we can now easily make the AG selection and allocation atomic from the perspective of a single perag context. This will ensure AGs going offline/away cannot occur between the selection and allocation steps. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
This is just a simple wrapper around the per-ag inode allocation that doesn't need to exist. The internal mechanism to select and allocate within an AG does not need to be exposed outside xfs_ialloc.c, and it being exposed simply makes it harder to follow the code and simplify it. This is simplified by internalising xf_dialloc_select_ag() and xfs_dialloc_ag() into a single xfs_dialloc() function and then xfs_dir_ialloc() can go away. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
xfs_dialloc_select_ag() does a lot of repetitive work. It first calls xfs_ialloc_ag_select() to select the AG to start allocation attempts in, which can do up to two entire loops across the perags that inodes can be allocated in. This is simply checking if there is spce available to allocate inodes in an AG, and it returns when it finds the first candidate AG. xfs_dialloc_select_ag() then does it's own iterative walk across all the perags locking the AGIs and trying to allocate inodes from the locked AG. It also doesn't limit the search to mp->m_maxagi, so it will walk all AGs whether they can allocate inodes or not. Hence if we are really low on inodes, we could do almost 3 entire walks across the whole perag range before we find an allocation group we can allocate inodes in or report ENOSPC. Because xfs_ialloc_ag_select() returns on the first candidate AG it finds, we can simply do these checks directly in xfs_dialloc_select_ag() before we lock and try to allocate inodes. This reduces the inode allocation pass down to 2 perag sweeps at most - one for aligned inode cluster allocation and if we can't allocate full, aligned inode clusters anywhere we'll do another pass trying to do sparse inode cluster allocation. This also removes a big chunk of duplicate code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
The only caller of xfs_dialloc_select_ag() will always return -ENOSPC to it's caller if the agbp returned from xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate AGI we can allocate inodes from is always an ENOSPC condition, so move this logic up into xfs_dialloc_select_ag() so we can simplify the return logic in this function. xfs_dialloc_select_ag() now only ever returns 0 with a locked agbp, or an error with no agbp. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Now that everything passes a perag, the agno is not needed anymore. Convert all the users to use pag->pag_agno instead and remove the agno from the cursor. This was largely done as an automated search and replace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Which will eventually completely replace the agno in it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
-
Dave Chinner authored
Needs a [from, to] ranged AG walk, and the perag to be stuffed into the info structure for callouts to use. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
We currently pass an agno from the AG reservation functions to the individual feature accounting functions, which in future may have to do perag lookups to access per-AG state. Instead, pre-emptively plumb the perag through from the highest AG reservation layer to the feature callouts so they won't have to look it up again. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
-
Dave Chinner authored
All of the callers of the busy extent API either have perag references available to use so we can pass a perag to the busy extent functions rather than having them have to do unnecessary lookups. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Clean up the last external manual AG walk. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Rather than manually walking the ags and passing agnunbers around, pass the perag for the AG we are currently working on around in the iwalk structure. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Convert the raw walks to an iterator, pulling the current AG out of pag->pag_agno instead of the loop iterator variable. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
for_each_perag_tag() is defined in xfs_icache.c for local use. Promote this to xfs_ag.h and define equivalent iteration functions so that we can use them to iterate AGs instead to replace open coded perag walks and perag lookups. We also convert as many of the straight forward open coded AG walks to use these iterators as possible. Anything that is not a direct conversion to an iterator is ignored and will be updated in future commits. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
Move the xfs_perag infrastructure to the libxfs files that contain all the per AG infrastructure. This helps set up for passing perags around all the code instead of bare agnos with minimal extra includes for existing files. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
The perag structures really need to be defined with the rest of the AG support infrastructure. The struct xfs_perag and init/teardown has been placed in xfs_mount.[ch] because there are differences in the structure between kernel and userspace. Mainly that userspace doesn't have a lot of the internal stuff that the kernel has for caches and discard and other such structures. However, it makes more sense to move this to libxfs than to keep this separation because we are now moving to use struct perags everywhere in the code instead of passing raw agnumber_t values about. Hence we shoudl really move the support infrastructure to libxfs/xfs_ag.[ch]. To do this without breaking userspace, first we need to rearrange the structures and code so that all the kernel specific code is located together. This makes it simple for userspace to ifdef out the all the parts it does not need, minimising the code differences between kernel and userspace. The next commit will do the move... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
Dave Chinner authored
They are AG functions, not superblock functions, so move them to the appropriate location. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-
- 01 Jun, 2021 3 commits
-
-
Darrick J. Wong authored
The superblock verifier already validates that (1 << blocklog) == blocksize, so use the value directly instead of doing math. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
-
Darrick J. Wong authored
Replace some open-coded fs block unit conversions with the standard conversion macro. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
-
Dave Chinner authored
Rather than open coding it just before we call _xfs_buf_free_pages(). Also, rename the function to xfs_buf_free_pages() as the leading underscore has no useful meaning. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
-