- 19 May, 2010 7 commits
-
-
Dave Chinner authored
The transaction ID that is written to the log for a transaction is currently set by taking the lower 32 bits of the memory address of the ticket structure. This is not guaranteed to be unique as tickets come from a slab and slots can be reallocated immediately after being freed. As a result, there is no guarantee of uniqueness in the ticket ID value. Fix this by assigning a random number to the ticket ID field so that duplicates are extremely unlikely, removing the possibility of transactions being mixed up during recovery due to duplicate IDs. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
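A minimal sketch of the change as it would sit in the ticket allocation path; random32() is the kernel PRNG of this era, and the surrounding context (the ticket's t_tid field) is assumed from the description above:

    /* assign a random ticket ID rather than the low 32 bits of the
     * ticket's memory address, which a slab can hand out again
     * immediately after a free */
    tic->t_tid = random32();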
-
Christoph Hellwig authored
Replace the awkward xlog_write_adv_cnt with an inline helper that makes it more obvious that it's modifying its parameters, and replace the use of an integer type for "ptr" with a real void pointer. Also move xlog_write_adv_cnt to xfs_log_priv.h as it will be used outside of xfs_log.c in the delayed logging series. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
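A hedged reconstruction of the helper from the description above (the exact body is assumed; note the kernel's void-pointer arithmetic extension):

    /* xfs_log_priv.h: advance the write pointer and adjust the
     * remaining length and offset in one obvious place */
    static inline void
    xlog_write_adv_cnt(void **ptr, int *len, int *off, size_t bytes)
    {
            *ptr += bytes;
            *len -= bytes;
            *off += bytes;
    }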
-
Dave Chinner authored
The current log IO vector structure is a flat array and not extensible. To make it possible to keep separate log IO vectors for individual log items, we need a method of chaining log IO vectors together. Introduce a new log vector type that can be used to wrap the existing log IO vectors, and use that internally to the log. This means that the existing external interface (xfs_log_write) does not change and hence no changes to the transaction commit code are required. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
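A sketch of the new chainable vector type implied by the description; the field names are assumptions in the style of the existing xfs_log_iovec:

    /* wraps an existing flat iovec array and allows vectors for
     * individual log items to be chained together */
    struct xfs_log_vec {
            struct xfs_log_vec      *lv_next;       /* next vector in chain */
            int                     lv_niovecs;     /* iovecs in this vector */
            struct xfs_log_iovec    *lv_iovecp;     /* the wrapped iovec array */
    };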
-
Christoph Hellwig authored
Reindent xlog_write to normal one tab indents and move all variable declarations into the closest enclosing block. Split from a bigger patch by Dave Chinner. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-
Dave Chinner authored
xlog_write is a mess that takes a lot of effort to understand. It is a mass of nested loops with 4 space indents to get it to fit in 80 columns and lots of funky variables whose meaning and purpose aren't obvious. Break it down into understandable chunks. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com>
-
Dave Chinner authored
When allocating a ticket for a transaction, the ticket is initialised with the worst case log space usage based on the number of bytes the transaction may consume. Part of this calculation is the number of log headers required for the iclog space used up by the transaction. This calculation makes an undocumented assumption that if the transaction uses the log header space reservation on an iclog, then it consumes either the entire iclog or it completes. That is - the transaction that is first in an iclog is the transaction that the log header reservation is accounted to. If the transaction is larger than the iclog, then it will use the entire iclog itself. Document this assumption. Further, the current calculation uses the rule that we can fit iclog_size bytes of transaction data into an iclog. This is incorrect - the amount of space available in an iclog for transaction data is the size of the iclog minus the space used for log record headers. This means that the calculation is out by 512 bytes per 32k of log space the transaction can consume. This is rarely an issue because maximally sized transactions are extremely uncommon, and for 4k block size filesystems maximal transaction reservations are about 400kb. Hence the error in this case is less than the size of an iclog, so that makes it even harder to hit. However, anyone using larger directory blocks (16k directory blocks push the maximum transaction size to approx. 900k on a 4k block size filesystem) or larger block size (e.g. 64k blocks push transactions to the 3-4MB size) could see the error grow to more than an iclog and at this point the transaction is guaranteed to get a reservation underrun and shut down the filesystem. Fix this by adjusting the calculation to compute the correct number of iclogs required and account for them all up front. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
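Roughly, the corrected arithmetic in the ticket sizing, sketched under the assumption of the usual xlog field names (howmany() is XFS's round-up division helper):

    /* only iclog_size - hsize bytes of an iclog hold transaction
     * data; the rest is the 512 byte log record header */
    iclog_space = log->l_iclog_size - log->l_iclog_hsize;
    num_headers = howmany(unit_bytes, iclog_space);

    /* account for every record header the transaction can touch
     * up front, instead of assuming one header per iclog_size */
    unit_bytes += log->l_iclog_hsize * num_headers;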
-
Dave Chinner authored
Each log item type does manual initialisation of the log item. Delayed logging introduces new fields that need initialisation, so factor all the open coded initialisation into a common function first. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
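A sketch of the common initialiser this factors out; the field set follows the standard xfs_log_item layout, though the exact body is assumed:

    void
    xfs_log_item_init(
            struct xfs_mount        *mp,
            struct xfs_log_item     *item,
            int                     type,
            struct xfs_item_ops     *ops)
    {
            item->li_mountp = mp;
            item->li_ailp = mp->m_ail;
            item->li_type = type;
            item->li_ops = ops;
    }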
-
- 16 Apr, 2010 1 commit
-
-
Dave Chinner authored
Updates to the VFS layer removed an extra ->sync_fs call into the filesystem during the sync process (from the quota code). Unfortunately the sync code was unknowingly relying on this call to make sure metadata buffers were flushed via a xfs_buftarg_flush() call to move the tail of the log forward in memory before the final transactions of the sync process were issued. As a result, the old code would write a very recent log tail value to the log by the end of the sync process, and so a subsequent crash would leave nothing for log recovery to do. Hence in qa test 182, log recovery only replayed a small handful of inode fsync transactions in this case. However, with the removal of the extra ->sync_fs call, the log tail was not moved forward with the inode fsync transactions near the end of the sync process because the first (and only) buftarg flush occurred after these transactions went to disk. The result is that log recovery now sees a large number of transactions for metadata that is already on disk. This usually isn't a problem, but when the transactions include inode chunk allocation, the inode create transactions and all subsequent changes are replayed as we cannot rely on what is on disk being valid. As a result, if the inode was written and contains unlogged changes, the unlogged changes are lost, thereby violating sync semantics. The fix is to always issue a transaction after the buftarg flush occurs if the log is not idle or covered. This results in a dummy transaction being written that contains the up-to-date log tail value, which will be very recent. Indeed, it will be at least as recent as the old code would have left on disk, so log recovery will behave exactly as it used to in this situation. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
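The shape of the fix, sketched with the existing helpers; the exact call site in the sync path is assumed:

    /* after the buftarg flush: if the log is not idle or covered,
     * write a dummy transaction so the on-disk log tail is current */
    if (xfs_log_need_covered(mp))
            xfs_fs_log_dummy(mp);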
-
- 01 Mar, 2010 1 commit
-
-
Christoph Hellwig authored
Currently we pass opaque xfs_log_ticket_t handles instead of struct xlog_ticket pointers, and void pointers instead of struct xlog_in_core pointers to various log manager functions. Instead pass properly typed pointers after adding forward declarations for them to xfs_log.h, and adjust the touched function prototypes to the standard XFS style while at it. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Alex Elder <aelder@sgi.com>
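A sketch of the resulting style in xfs_log.h; the prototype shown is one plausible example, not an exhaustive list:

    /* forward declarations let prototypes take typed pointers
     * instead of opaque handles and void pointers */
    struct xlog_ticket;
    struct xlog_in_core;

    int     xfs_log_release_iclog(struct xfs_mount *mp,
                                  struct xlog_in_core *iclog);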
-
- 21 Jan, 2010 2 commits
-
-
Christoph Hellwig authored
Remove the XFS_LOG_FORCE argument which was always set, and the XFS_LOG_URGE define, which was never used. Split xfs_log_force into two helpers - xfs_log_force, which forces the whole log, and xfs_log_force_lsn, which forces up to the specified LSN. The underlying implementations already were entirely separate, as were the users. Also re-indent the new _xfs_log_force/_xfs_log_force_lsn which previously had a weird coding style. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
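The split, sketched as prototypes; the internal variants taking a log_flushed out-parameter are assumed from the description:

    int     _xfs_log_force(struct xfs_mount *mp, uint flags,
                           int *log_flushed);
    void    xfs_log_force(struct xfs_mount *mp, uint flags);

    int     _xfs_log_force_lsn(struct xfs_mount *mp, xfs_lsn_t lsn,
                               uint flags, int *log_flushed);
    void    xfs_log_force_lsn(struct xfs_mount *mp, xfs_lsn_t lsn,
                              uint flags);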
-
Christoph Hellwig authored
This macro only obfuscates the log item type assignments, so kill it. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
- 15 Jan, 2010 1 commit
-
-
Christoph Hellwig authored
Don't bother using XFS_bwrite as it doesn't provide much code for our use case. Instead opencode it and fold xlog_bdstrat_cb into the new xlog_bdstrat helper. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
- 16 Dec, 2009 1 commit
-
-
Dave Chinner authored
Change all async metadata buffers to use [READ|WRITE]_META I/O types so that the I/O doesn't get issued immediately. This allows merging of adjacent metadata requests but still prioritises them over bulk data. This shows a 10-15% improvement in sequential create speed of small files. Don't include the log buffers in this classification - leave them as sync types so they are issued immediately. Signed-off-by:
Dave Chinner <dgc@sgi.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
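A heavily hedged sketch of the classification in the buffer I/O submission path; the flag test is an assumption, the point being that async metadata gets the mergeable _META request types while everything classified sync keeps being issued immediately:

    /* async metadata may sit in the elevator and merge with
     * adjacent requests; sync I/O (including the log buffers) is
     * dispatched immediately */
    if (bp->b_flags & XBF_READ)
            rw = (bp->b_flags & XBF_ASYNC) ? READ_META : READ_SYNC;
    else
            rw = (bp->b_flags & XBF_ASYNC) ? WRITE_META : WRITE_SYNC;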
-
- 15 Dec, 2009 1 commit
-
-
Christoph Hellwig authored
Convert the old xfs tracing support that could only be used with the out of tree kdb and xfsidbg patches to use the generic event tracer. To use it make sure CONFIG_EVENT_TRACING is enabled and then enable all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

or alternatively enable single events by just doing the same in one event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

or set more complex filters, etc. All this is described in more detail in Documentation/trace/events.txt. To read the events do a

    cat /sys/kernel/debug/tracing/trace

Compared to the last posting this patch converts the tracing mostly to the one tracepoint per callsite model that other users of the new tracing facility also employ. This allows very fine-grained control of the tracing, a cleaner output of the traces and also enables the perf tool to use each tracepoint as a virtual performance counter, allowing us to e.g. count how often certain workloads hit various spots in XFS. Take a look at http://lwn.net/Articles/346470/ for some examples. The btree tracing isn't included at all yet, as it will require additional core tracing features not in mainline yet; I plan to deliver it later. And the really nice thing about this patch is that it actually removes many lines of code while adding this nice functionality:

    fs/xfs/Makefile                |    8
    fs/xfs/linux-2.6/xfs_acl.c     |    1
    fs/xfs/linux-2.6/xfs_aops.c    |   52 -
    fs/xfs/linux-2.6/xfs_aops.h    |    2
    fs/xfs/linux-2.6/xfs_buf.c     |  117 +--
    fs/xfs/linux-2.6/xfs_buf.h     |   33
    fs/xfs/linux-2.6/xfs_fs_subr.c |    3
    fs/xfs/linux-2.6/xfs_ioctl.c   |    1
    fs/xfs/linux-2.6/xfs_ioctl32.c |    1
    fs/xfs/linux-2.6/xfs_iops.c    |    1
    fs/xfs/linux-2.6/xfs_linux.h   |    1
    fs/xfs/linux-2.6/xfs_lrw.c     |   87 --
    fs/xfs/linux-2.6/xfs_lrw.h     |   45 -
    fs/xfs/linux-2.6/xfs_super.c   |  104 ---
    fs/xfs/linux-2.6/xfs_super.h   |    7
    fs/xfs/linux-2.6/xfs_sync.c    |    1
    fs/xfs/linux-2.6/xfs_trace.c   |   75 ++
    fs/xfs/linux-2.6/xfs_trace.h   | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h   |    4
    fs/xfs/quota/xfs_dquot.c       |  110 ---
    fs/xfs/quota/xfs_dquot.h       |   21
    fs/xfs/quota/xfs_qm.c          |   40 -
    fs/xfs/quota/xfs_qm_syscalls.c |    4
    fs/xfs/support/ktrace.c        |  323 ---------
    fs/xfs/support/ktrace.h        |   85 --
    fs/xfs/xfs.h                   |   16
    fs/xfs/xfs_ag.h                |   14
    fs/xfs/xfs_alloc.c             |  230 +----
    fs/xfs/xfs_alloc.h             |   27
    fs/xfs/xfs_alloc_btree.c       |    1
    fs/xfs/xfs_attr.c              |  107 ---
    fs/xfs/xfs_attr.h              |   10
    fs/xfs/xfs_attr_leaf.c         |   14
    fs/xfs/xfs_attr_sf.h           |   40 -
    fs/xfs/xfs_bmap.c              |  507 +++------------
    fs/xfs/xfs_bmap.h              |   49 -
    fs/xfs/xfs_bmap_btree.c        |    6
    fs/xfs/xfs_btree.c             |    5
    fs/xfs/xfs_btree_trace.h       |   17
    fs/xfs/xfs_buf_item.c          |   87 --
    fs/xfs/xfs_buf_item.h          |   20
    fs/xfs/xfs_da_btree.c          |    3
    fs/xfs/xfs_da_btree.h          |    7
    fs/xfs/xfs_dfrag.c             |    2
    fs/xfs/xfs_dir2.c              |    8
    fs/xfs/xfs_dir2_block.c        |   20
    fs/xfs/xfs_dir2_leaf.c         |   21
    fs/xfs/xfs_dir2_node.c         |   27
    fs/xfs/xfs_dir2_sf.c           |   26
    fs/xfs/xfs_dir2_trace.c        |  216 ------
    fs/xfs/xfs_dir2_trace.h        |   72 --
    fs/xfs/xfs_filestream.c        |    8
    fs/xfs/xfs_fsops.c             |    2
    fs/xfs/xfs_iget.c              |  111 ---
    fs/xfs/xfs_inode.c             |   67 --
    fs/xfs/xfs_inode.h             |   76 --
    fs/xfs/xfs_inode_item.c        |    5
    fs/xfs/xfs_iomap.c             |   85 --
    fs/xfs/xfs_iomap.h             |    8
    fs/xfs/xfs_log.c               |  181 +----
    fs/xfs/xfs_log_priv.h          |   20
    fs/xfs/xfs_log_recover.c       |    1
    fs/xfs/xfs_mount.c             |    2
    fs/xfs/xfs_quota.h             |    8
    fs/xfs/xfs_rename.c            |    1
    fs/xfs/xfs_rtalloc.c           |    1
    fs/xfs/xfs_rw.c                |    3
    fs/xfs/xfs_trans.h             |   47 +
    fs/xfs/xfs_trans_buf.c         |   62 -
    fs/xfs/xfs_vnodeops.c          |    8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
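For flavour, a sketch of what one tracepoint per callsite looks like with the generic event tracer; the fields and format string here are illustrative, not copied from xfs_trace.h:

    TRACE_EVENT(xfs_ihold,
            TP_PROTO(struct xfs_inode *ip, unsigned long caller_ip),
            TP_ARGS(ip, caller_ip),
            TP_STRUCT__entry(
                    __field(dev_t, dev)
                    __field(xfs_ino_t, ino)
                    __field(unsigned long, caller_ip)
            ),
            TP_fast_assign(
                    __entry->dev = VFS_I(ip)->i_sb->s_dev;
                    __entry->ino = ip->i_ino;
                    __entry->caller_ip = caller_ip;
            ),
            TP_printk("dev %d:%d ino 0x%llx caller %pf",
                      MAJOR(__entry->dev), MINOR(__entry->dev),
                      __entry->ino, (void *)__entry->caller_ip)
    );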
-
- 12 Aug, 2009 1 commit
-
-
Christoph Hellwig authored
Without SMP or preemption spin_is_locked always returns false, so we can't do an assert with it. Instead use assert_spin_locked, which does the right thing on all builds. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Eric Sandeen <sandeen@sandeen.net> Reported-by:
Johannes Engel <jcnengel@googlemail.com> Tested-by:
Johannes Engel <jcnengel@googlemail.com> Signed-off-by:
Felix Blyakher <felixb@sgi.com>
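The before/after, sketched against l_icloglock (the lock name is taken from the other commits in this series; the exact assert site is assumed):

    /* before: spin_is_locked() always returns false on !SMP, so the
     * assert fires spuriously on uni-processor builds */
    ASSERT(spin_is_locked(&log->l_icloglock));

    /* after: correct on all builds */
    assert_spin_locked(&log->l_icloglock);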
-
- 11 Aug, 2009 1 commit
-
-
Christoph Hellwig authored
Without SMP or preemption spin_is_locked always returns false, so we can't do an assert with it. Instead use assert_spin_locked, which does the right thing on all builds. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Eric Sandeen <sandeen@sandeen.net> Reported-by:
Johannes Engel <jcnengel@googlemail.com> Tested-by:
Johannes Engel <jcnengel@googlemail.com> Signed-off-by:
Felix Blyakher <felixb@sgi.com>
-
- 06 Apr, 2009 2 commits
-
-
Dave Chinner authored
When trying to reserve log space, we find the amount of space we need, then go to sleep waiting for space. When we are woken, we try to push the tail of the log forward to make sure we have space available. Unfortunately, this means that if there is no space available and everyone who needs space goes to sleep, there is no-one left to push the tail of the log to make space available. Once we have a thread waiting for space to become available, the others queue up behind it in a FIFO, and none of them push the tail of the log. This can result in everyone going to sleep in xlog_grant_log_space() if the first sleeper races with the last I/O that moves the tail of the log forward. With no further I/O to move the tail of the log, there is nothing to wake the sleepers and hence all transactions just stop. Fix this by making sure the xfsaild will create enough space for the transaction that is about to sleep, by moving the push target far enough forwards to ensure that the current process will have enough space available when it is woken. That is, we push the AIL before we go to sleep. Because we've inserted the log ticket into the queue before we've pushed and gone to sleep, subsequent transactions will wait behind this one. Hence we are guaranteed to have space available when we are woken. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
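A sketch of the reordering in the grant path; xlog_grant_push_ail() is the helper that moves the push target, and the surrounding context is assumed:

    /* move the AIL push target far enough to free the space we
     * need *before* queueing the ticket and going to sleep, so a
     * sleeper is never left with no-one to move the log tail */
    xlog_grant_push_ail(mp, need_bytes);

    /* ... then insert the ticket on the grant queue and sleep;
     * later waiters queue behind us, so space is there on wakeup */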
-
Dave Chinner authored
If the large log sector size feature bit is set in the superblock by accident (say, disk corruption), then the fields that are now considered valid are not checked on production kernels. The checks are present as ASSERT statements, so they only cause a panic on a debug kernel. Change this so that the fields are validity checked if the feature bit is set, and abort the log mount if the fields do not contain valid values. Reported-by:
Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
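A hedged sketch of the change at log mount time; the feature check is the real helper, but the bounds shown are illustrative:

    if (xfs_sb_version_hassector(&mp->m_sb)) {
            /* validate unconditionally instead of ASSERTing, so a
             * corrupt feature bit fails the log mount on production
             * kernels rather than panicking debug ones */
            if (mp->m_sb.sb_logsectlog < BBSHIFT) {
                    error = XFS_ERROR(EINVAL);
                    goto out_free_log;
            }
    }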
-
- 29 Mar, 2009 1 commit
-
-
Malcolm Parsons authored
Signed-off-by:
Malcolm Parsons <malcolm.parsons@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
- 16 Mar, 2009 1 commit
-
-
Christoph Hellwig authored
Kill the current xfs_log_unmount wrapper and opencode the two function calls in the only caller. Rename the current xfs_log_unmount_dealloc to xfs_log_unmount as it undoes xfs_log_mount and the new name makes that more clear. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com>
-
- 12 Feb, 2009 1 commit
-
-
Christoph Hellwig authored
We can't just call xfs_log_unmount_dealloc on any failure because the ail thread which is torn down by xfs_log_unmount_dealloc might not be initialized yet. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Felix Blyakher <felixb@sgi.com> Reported-by:
Lachlan McIlroy <lachlan@sgi.com>
-
- 09 Feb, 2009 1 commit
-
-
Christoph Hellwig authored
Our default has been to always use 8 32KB log buffers for a while now, so remove the special casing for larger block size filesystems to use the same or even lower number of buffers. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com>
-
- 04 Dec, 2008 1 commit
-
-
Christoph Hellwig authored
All but one caller of xlog_state_want_sync drop and re-acquire l_icloglock around the call to it, just so that xlog_state_want_sync can acquire and drop it. Move all lock operations out of xlog_state_want_sync and assert that l_icloglock is held when it is called. Note that it would make sense to extend this scheme to xlog_state_release_iclog, but the locking in there is more complicated and we'd like to keep the atomic_dec_and_lock optimization for those callers not already holding l_icloglock. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Niv Sardi <xaiki@sgi.com>
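A sketch of the resulting function, assuming the usual xlog typedefs; callers now hold l_icloglock across the call:

    static void
    xlog_state_want_sync(xlog_t *log, xlog_in_core_t *iclog)
    {
            assert_spin_locked(&log->l_icloglock);

            /* no lock juggling in here any more */
            if (iclog->ic_state == XLOG_STATE_ACTIVE)
                    xlog_state_switch_iclogs(log, iclog, 0);
    }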
-
- 01 Dec, 2008 2 commits
-
-
Christoph Hellwig authored
Move all fields from xlog_iclog_fields_t into xlog_in_core_t instead of having them in a substructure and then using #defines to make it look like they were directly in xlog_in_core_t. Also document that xlog_in_core_2_t is grossly misnamed, and make all references to it typesafe. (First sent on September 15th) Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Niv Sardi <xaiki@sgi.com>
-
Christoph Hellwig authored
xfs_log_force_umount may be called very early during log recovery: if we fail a buffer read in xlog_recover_do_inode_trans we abort the mount. But at that point log recovery has already started delayed writeback of inode buffers. As part of the aborted mount we try to flush out all delwri buffers, but at that point we have already freed the superblock and set mp->m_sb_bp to NULL, and xfs_log_force_umount, which gets called after the inode buffer writeback, trips over it. Make xfs_log_force_umount a little more careful when accessing mp->m_sb_bp to avoid this. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Eric Sandeen <sandeen@sandeen.net> Signed-off-by:
Niv Sardi <xaiki@sgi.com>
-
- 17 Nov, 2008 1 commit
-
-
Dave Chinner authored
When an I/O error occurs during an intermediate commit on a rolling transaction, xfs_trans_commit() will free the transaction structure and the related ticket. However, the duplicate transaction that gets used as the transaction continues still contains a pointer to the ticket. Hence when the duplicate transaction is cancelled and freed, we free the ticket a second time. Add reference counting to the ticket so that we hold an extra reference to the ticket over the transaction commit. We drop the extra reference once we have checked that the transaction commit did not return an error, thus avoiding a double free on commit error. Credit to Nick Piggin for tripping over the problem. SGI-PV: 989741 Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com>
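A sketch of the reference counting added to the ticket (an atomic t_ref field; the slab zone name is an assumption):

    void
    xfs_log_ticket_put(xlog_ticket_t *ticket)
    {
            ASSERT(atomic_read(&ticket->t_ref) > 0);
            if (atomic_dec_and_test(&ticket->t_ref))
                    kmem_zone_free(xfs_log_ticket_zone, ticket);
    }

    xlog_ticket_t *
    xfs_log_ticket_get(xlog_ticket_t *ticket)
    {
            ASSERT(atomic_read(&ticket->t_ref) > 0);
            atomic_inc(&ticket->t_ref);
            return ticket;
    }

The commit path takes the extra reference before committing and drops it only after checking the return value, so a cancelled duplicate transaction cannot trigger a second free.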
-
- 10 Nov, 2008 2 commits
-
-
Dave Chinner authored
When there is no memory left in the system, xfs_buf_get_noaddr() can fail. If this happens at mount time during xlog_alloc_log() we fail to catch the error and oops. Catch the error from xfs_buf_get_noaddr(), and allow other memory allocations to fail and catch those errors too. Report the error to the console and fail the mount with ENOMEM. Tested by manually injecting errors into xfs_buf_get_noaddr() and xlog_alloc_log(). Version 2: o remove unnecessary casts of the returned pointer from kmem_zalloc() SGI-PV: 987246 Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com>
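The error handling pattern, sketched inside xlog_alloc_log(); labels and cleanup details are assumed:

    log = kmem_zalloc(sizeof(xlog_t), KM_MAYFAIL);
    if (!log)
            return NULL;            /* mount fails with ENOMEM */

    bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp);
    if (!bp)
            goto out_free_log;      /* same pattern per iclog buffer */

    /* ... */

    out_free_log:
            kmem_free(log);
            return NULL;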
-
Dave Chinner authored
When there is no memory left in the system, xfs_buf_get_noaddr() can fail. If this happens at mount time during xlog_alloc_log() we fail to catch the error and oops. Catch the error from xfs_buf_get_noaddr(), and allow other memory allocations to fail and catch those errors too. Report the error to the console and fail the mount with ENOMEM. Tested by manually injecting errors into xfs_buf_get_noaddr() and xlog_alloc_log(). Version 2: o remove unnecessary casts of the returned pointer from kmem_zalloc() SGI-PV: 987246 Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com>
-
- 30 Oct, 2008 5 commits
-
-
David Chinner authored
Change all the remaining AIL API functions that are passed struct xfs_mount pointers to pass pointers directly to the struct xfs_ail being used. With this conversion, all external access to the AIL is via the struct xfs_ail. Hence the operation and referencing of the AIL is almost entirely independent of the xfs_mount that is using it - it is now much more tightly tied to the log and the items it is tracking in the log than it is tied to the xfs_mount. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32353a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
Christoph Hellwig <hch@infradead.org>
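The conversion, sketched as a before/after prototype using xfs_trans_ail_push as the example:

    /* before: external code reaches the AIL via the mount */
    void    xfs_trans_ail_push(struct xfs_mount *mp,
                               xfs_lsn_t threshold_lsn);

    /* after: external code holds the struct xfs_ail directly */
    void    xfs_trans_ail_push(struct xfs_ail *ailp,
                               xfs_lsn_t threshold_lsn);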
-
David Chinner authored
When we need to go from the log to the AIL, we have to go via the xfs_mount. Add an xfs_ail pointer to the log so we can go directly to the AIL associated with the log. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32351a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
Christoph Hellwig <hch@infradead.org>
-
David Chinner authored
Bring the ail lock inside the struct xfs_ail. This means the AIL can be entirely manipulated via the struct xfs_ail rather than needing both the struct xfs_mount and the struct xfs_ail. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32350a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
Christoph Hellwig <hch@infradead.org>
-
David Chinner authored
With the new cursor interface, it makes sense to make all the traversing code use the cursor interface and make the old one go away. This means more of the AIL interfacing is done by passing struct xfs_ail pointers around the place instead of struct xfs_mount pointers. We can replace the use of xfs_trans_first_ail() in xfs_log_need_covered() as it is only checking if the AIL is empty. We can do that with a call to xfs_trans_ail_tail() instead, where a zero LSN returned indicates an empty AIL... SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32348a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
Christoph Hellwig <hch@infradead.org>
-
David Chinner authored
To replace the current generation number ensuring sanity of the AIL traversal, replace it with an external cursor that is linked to the AIL. Basically, we store the next item in the cursor whenever we want to drop the AIL lock to do something to the current item. When we regain the lock, the current item may already be free, so we can't reference it, but the next item in the traversal is already held in the cursor. When we move or delete an object, we search all the active cursors and if there is an item match we clear the cursor(s) that point to the object. This forces the traversal to restart transparently. We don't invalidate the cursor on insert because the cursor still points to a valid item. If the item is inserted between the current item and the cursor it does not matter; the traversal is considered to be past the insertion point so it will be picked up in the next traversal. Hence traversal restarts pretty much disappear altogether with this method of traversal, which should substantially reduce the overhead of pushing on a busy AIL.
Version 2:
o add restart logic
o comment cursor interface
o minor cleanups
SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32347a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
Christoph Hellwig <hch@infradead.org>
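A sketch of the cursor machinery described above; the structure follows the text, and the clearing detail (NULLing the item pointer, the xa_cursors list head) is a simplification of whatever invalidation the real code uses:

    struct xfs_ail_cursor {
            struct xfs_ail_cursor   *next;  /* list of active cursors */
            struct xfs_log_item     *item;  /* next item to visit */
    };

    /* on move or delete: clear any cursor pointing at the item so
     * that traversal restarts transparently */
    static void
    xfs_trans_ail_cursor_clear(struct xfs_ail *ailp,
                               struct xfs_log_item *lip)
    {
            struct xfs_ail_cursor   *cur;

            for (cur = ailp->xa_cursors; cur; cur = cur->next) {
                    if (cur->item == lip)
                            cur->item = NULL;
            }
    }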
-
- 10 Oct, 2008 1 commit
-
-
Christoph Hellwig authored
Currently we disable barriers as soon as we get a buffer in xlog_iodone that has the XBF_ORDERED flag cleared. But this can be the case not only for buffers where the barrier failed, but also the first buffer of a split log write in case of a log wraparound. Due to the disabled barriers we can easily get directory corruption on unclean shutdowns. So instead of using this check add a new buffer flag for failed barrier writes. This is a regression vs 2.6.26 caused by a patch to use the right macro to check for the ORDERED flag, as we previously got true returned for every buffer. Thanks to Toei Rei for reporting the bug. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Eric Sandeen <sandeen@sandeen.net> Reviewed-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Tim Shimmin <tes@sgi.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
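The shape of the fix in xlog_iodone(), sketched with a new flag named in the style of the other buffer flags; the exact flag name and message are assumptions:

    /* only a barrier write that actually failed disables barriers;
     * a cleared XBF_ORDERED alone can also mean the first buffer of
     * a split log write after a wraparound */
    if (bp->b_flags & _XBF_BARRIER_FAILED) {
            bp->b_flags &= ~_XBF_BARRIER_FAILED;
            l->l_mp->m_flags &= ~XFS_MOUNT_BARRIER;
            xfs_fs_cmn_err(CE_NOTE, l->l_mp,
                    "xlog_iodone: Barriers no longer supported"
                    " by device. Disabling barriers\n");
    }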
-
- 17 Sep, 2008 2 commits
-
-
David Chinner authored
The current code in xlog_iodone() uses the wrong macro to check if the barrier has been cleared due to an EOPNOTSUPP error from the lower layer. SGI-PV: 986143 SGI-Modid: xfs-linux-melb:xfs-kern:31984a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Nathaniel W. Turner <nate@houseofnate.net> Signed-off-by:
Peter Leckie <pleckie@sgi.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com>
-
Lachlan McIlroy authored
Memory allocations for log->l_grant_trace and iclog->ic_trace are done on demand when the first event is logged. In xlog_state_get_iclog_space() we call xlog_trace_iclog() under a spinlock and allocating memory here can cause us to sleep with a spinlock held and deadlock the system. For the log grant tracing we use KM_NOSLEEP but that means we can lose trace entries. Since there is no locking to serialize the log grant tracing we could race and have multiple allocations and leak memory. So move the allocations to where we initialize the log/iclog structures. Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace since it's not even used. SGI-PV: 983738 SGI-Modid: xfs-linux-melb:xfs-kern:31896a Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
Christoph Hellwig <hch@infradead.org>
-
- 13 Aug, 2008 4 commits
-
-
Lachlan McIlroy authored
The ticket allocation code got reworked in 2.6.26 and we now free tickets whereas before we used to cache them so the use-after-free went undetected. SGI-PV: 985525 SGI-Modid: xfs-linux-melb:xfs-kern:31877a Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
David Chinner <david@fromorbit.com>
-
Lachlan McIlroy authored
Use KM_NOFS to prevent recursion back into the filesystem which can cause deadlocks. In the case of xfs_iread() we hold the lock on the inode cluster buffer while allocating memory for the trace buffers. If we recurse back into XFS to flush data that may require a transaction to allocate extents which needs log space. This can deadlock with the xfsaild thread which can't push the tail of the log because it is trying to get the inode cluster buffer lock. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31838a Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com> Signed-off-by:
David Chinner <david@fromorbit.com>
-
Christoph Hellwig authored
Remove all the useless flags and code keyed off it in xfs_mountfs. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31831a Signed-off-by:
Christoph Hellwig <hch@infradead.org> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com>
-
David Chinner authored
A lot of code has been converted away from semaphores, but there are still comments that reference semaphore behaviour. The log code is the worst offender. Update the comments to reflect what the code really does now. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31814a Signed-off-by:
David Chinner <david@fromorbit.com> Signed-off-by:
Lachlan McIlroy <lachlan@sgi.com>
-