- 26 Apr, 2010 40 commits
-
Andreas Herrmann authored
commit 9d260ebc upstream. Use NodeId MSR to get NodeId and number of nodes per processor. Signed-off-by:
Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20091216144355.GB28798@alberich.amd.com> Signed-off-by:
H. Peter Anvin <hpa@zytor.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Daniel T Chen authored
commit b8e80cf3 upstream. BugLink: https://launchpad.net/bugs/551606 The OR's hardware distorts at PCM 100% because it does not correspond to 0 dB. Fix this in patch_ad1981() for all models using the Thinkpad quirk. Reported-by: Jane Silber Signed-off-by:
Daniel T Chen <crimsun@ubuntu.com> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Carpenter authored
commit b0cc58a2 upstream. The original code doesn't take into consideration that the value of MIXART_BA0_SIZE - pos can be less than zero which would lead to a large unsigned value for "count". Also I moved the check that read size is a multiple of 4 bytes below the code that adjusts "count". Signed-off-by:
Dan Carpenter <error27@gmail.com> Acked-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
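The underflow described above can be sketched in plain C. This is a minimal illustration, not the driver's code: `clamp_read_count()` and its `region_size` parameter are hypothetical names, with `region_size` standing in for MIXART_BA0_SIZE. It checks the offset before subtracting, and applies the 4-byte rounding only after "count" has been adjusted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Clamp a read of `count` bytes at offset `pos` to a region of
 * `region_size` bytes, then round down to a multiple of 4.
 * All names here are illustrative, not the driver's.
 */
size_t clamp_read_count(long region_size, long pos, size_t count)
{
    /* Check the offset first: if pos is at or past the end, region_size - pos
     * is zero or negative and must never be converted to an unsigned count. */
    if (pos < 0 || pos >= region_size)
        return 0;

    size_t remaining = (size_t)(region_size - pos);
    if (count > remaining)
        count = remaining;

    /* Enforce the 4-byte multiple only after the count has been adjusted. */
    return count & ~(size_t)3;
}

/* Usage: a few boundary cases. */
void demo_clamp_read_count(void)
{
    assert(clamp_read_count(0x2000, 0, 16) == 16);          /* fits as-is */
    assert(clamp_read_count(0x2000, 0x2000 - 6, 100) == 4); /* 6 left, rounded down */
    assert(clamp_read_count(0x2000, 0x3000, 8) == 0);       /* past the end */
}
```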
-
Wu Fengguang authored
commit 70655c06 upstream. btrfs relocate_file_extent_cluster() calls us with NULL filp: [ 4005.426805] BUG: unable to handle kernel NULL pointer dereference at 00000021 [ 4005.426818] IP: [<c109a130>] page_cache_sync_readahead+0x18/0x3e Signed-off-by:
Wu Fengguang <fengguang.wu@intel.com> Cc: Yan Zheng <yanzheng@21cn.com> Reported-by:
Kirill A. Shutemov <kirill@shutemov.name> Tested-by:
Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Anton Blanchard authored
commit 55ab3a1f upstream. Commit 148f948b (vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke the raw driver. We now call through generic_file_aio_write -> generic_write_sync -> vfs_fsync_range. vfs_fsync_range has: if (!fop || !fop->fsync) { ret = -EINVAL; goto out; } But drivers/char/raw.c doesn't set an fsync method. We have two options: fix it or remove the raw driver completely. I'm happy to do either, the fact this has been broken for so long suggests it is rarely used. The patch below adds an fsync method to the raw driver. My knowledge of the block layer is pretty sketchy so this could do with a once over. If we instead decide to remove the raw driver, this patch might still be useful as a backport to 2.6.33 and 2.6.32. Signed-off-by:
Anton Blanchard <anton@samba.org> Reviewed-by:
Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by:
Jeff Moyer <jmoyer@redhat.com> Tested-by:
Jeff Moyer <jmoyer@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
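The failure mode in this message can be modelled outside the kernel. The sketch below is an assumption-laden illustration: `struct fops_sketch`, `sync_range_sketch()` and `raw_fsync_sketch()` are hypothetical stand-ins, and only the NULL check mirrors the snippet quoted in the commit message. It shows why a file operations table without an fsync method turns a synchronous write into -EINVAL, and how supplying a method removes the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down file operations table with only an fsync hook. */
struct fops_sketch {
    int (*fsync)(int *flush_count);
};

/* Mirrors the quoted check: no table or no fsync method means -EINVAL. */
int sync_range_sketch(const struct fops_sketch *fop, int *flush_count)
{
    if (!fop || !fop->fsync)
        return -EINVAL;
    return fop->fsync(flush_count);
}

/* Stand-in for an fsync method; it only counts how often it was called. */
int raw_fsync_sketch(int *flush_count)
{
    (*flush_count)++;
    return 0;
}

/* Usage: the same call fails without the method and succeeds with it. */
void demo_raw_fsync(void)
{
    const struct fops_sketch without_fsync = { .fsync = NULL };
    const struct fops_sketch with_fsync = { .fsync = raw_fsync_sketch };
    int flushes = 0;

    assert(sync_range_sketch(&without_fsync, &flushes) == -EINVAL);
    assert(sync_range_sketch(&with_fsync, &flushes) == 0);
    assert(flushes == 1);
}
```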
-
Jiri Kosina authored
commit d8e4ebf8 upstream. Fix oops caused by dereferencing field->hidinput in cases where the device hasn't been claimed by hid-input. Reported-by:
Andreas Demmer <mail@andreas-demmer.de> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Alan Cox authored
commit d6250a03 upstream. Making the new stuff work broke some of the old chipsets. We need to go back to the old set up values for these it seems. Unfortunately even with documentation this is basically a mix of cargoculting and guesswork. Chased down to the exact line by Gianluca. Signed-off-by:
Alan Cox <alan@linux.intel.com> Signed-off-by:
Jeff Garzik <jgarzik@redhat.com> Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Éric Piel authored
commit 4b5d95b3 upstream. Originally the driver only targeted 12-bit sensors. When support for 8-bit sensors was added, some slight differences in the registers were overlooked. This should fix it, both for initialization and for displaying the rate. Reported-by:
Kalhan Trisal <kalhan.trisal@intel.com> Reported-by:
Christoph Plattner <christoph.plattner@gmx.at> Tested-by:
Christoph Plattner <christoph.plattner@gmx.at> Tested-by:
Samu Onkalo <samu.p.onkalo@nokia.com> Signed-off-by:
Éric Piel <eric.piel@tremplin-utc.net> Signed-off-by:
Samu Onkalo <samu.p.onkalo@nokia.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Oleg Nesterov authored
commit 6da8d866 upstream. release_one_tty(tty) can be called when tty still has a reference to pgrp/session. In this case we leak the pid. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Reported-by:
Catalin Marinas <catalin.marinas@arm.com> Reported-and-tested-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by:
Linus Torvalds <torvalds@linux-foundation.org> Acked-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Thomas Gleixner authored
commit 753649db upstream. Network folks reported that directing all MSI-X vectors of their multi queue NICs to a single core can cause interrupt stack overflows when enough interrupts fire at the same time. This is caused by the fact that we run interrupt handlers by default with interrupts enabled unless the driver requests the interrupt with the IRQF_DISABLED flag set. The NIC handlers do not set this flag, so simultaneous interrupts can nest without limit and cause the stack overflow. The only safe countermeasure is to run the interrupt handlers with interrupts disabled. We can't switch to this mode in general right now, but it is safe to do so for MSI interrupts. Force IRQF_DISABLED for MSI interrupt handlers. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@osdl.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: David Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Seth Heasley authored
commit 4c7d8492 upstream. This patch adds the Intel Cougar Point PCH LPC Controller DeviceIDs for iTCO Watchdog. Signed-off-by:
Seth Heasley <seth.heasley@intel.com> Signed-off-by:
Wim Van Sebroeck <wim@iguana.be> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Thomas Mingarelli authored
commit 8ba42bd8 upstream. [Novell Bug 581103] HP Watchdog driver has arbitrary (wrong) timeout limits. Fix the lower timeout limit to a more appropriate value. Signed-off-by:
Thomas Mingarelli <Thomas.Mingarelli@hp.com> Signed-off-by:
Wim Van Sebroeck <wim@iguana.be> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Wey-Yi Guy authored
commit 74e2bd1f upstream. When the hardware needs to be restarted or reconfigured, tear down all the aggregation queues and let mac80211 and the driver get in sync so that they have the opportunity to re-establish the aggregation queues. We need to wait until the driver has re-established all the station information before tearing down the aggregation queues, because the driver (at least the iwlwifi driver) will reject the stop aggregation queue request if the station is not ready. We also need to make sure the aggregation queues are torn down before waking up the queues, so that mac80211 will not send frames with the aggregation bit set. Signed-off-by:
Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Johannes Berg authored
commit 7236fe29 upstream. "mac80211: fix skb buffering issue" still left a race between enabling the hardware queues and the virtual interface queues. In hindsight it's totally obvious that enabling the netdev queues for a hardware queue when the hardware queue is enabled is wrong, because it could well be possible that we can fill the hw queue with packets we already have pending. Thus, we must only enable the netdev queues once all the pending packets have been processed and sent off to the device. In testing, I haven't been able to trigger this race condition, but it's clearly there, possibly only when aggregation is being enabled. Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Valentin Longchamp authored
commit 2d20c72c upstream. An int urb is constructed but we fill it in with a bulk pipe type. Commit f661c6f8 implemented a pipe type check when CONFIG_USB_DEBUG is enabled. The check failed for all the ar9170 usb transfers and the driver could not configure the wifi dongle. This went unnoticed until now because most people don't have CONFIG_USB_DEBUG enabled. Signed-off-by:
Valentin Longchamp <valentin.longchamp@epfl.ch> Acked-by:
Christian Lamparter <chunkeey@googlemail.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dan Carpenter authored
commit 8e1a53c6 upstream. IWL_RATE_COUNT is 13 and IWL_RATE_COUNT_LEGACY is 12. IWL_RATE_COUNT_LEGACY is the right one here because iwl3945_rates doesn't support 60M and also that's how "rates" is defined in iwlcore_init_geos() from drivers/net/wireless/iwlwifi/iwl-core.c. rates = kzalloc((sizeof(struct ieee80211_rate) * IWL_RATE_COUNT_LEGACY), GFP_KERNEL); Signed-off-by:
Dan Carpenter <error27@gmail.com> Acked-by:
Zhu Yi <yi.zhu@intel.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Stanislaw Gruszka authored
During backporting of a120e912 ("iwlwifi: sanity check before counting number of tfds can be free") we forgot one hunk, which makes a lot of "free more than tfds_in_queue" messages show up in dmesg. Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Tested-by:
Adel Gadllah <adel.gadllah@gmail.com> (picked from https://patchwork.kernel.org/patch/86722/) Signed-off-by:
Stefan Bader <stefan.bader@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Wey-Yi Guy authored
commit be6b38bc upstream. One hunk for 4965 was forgotten in the "iwlwifi: error checking for number of tfds in queue" patch. Reported-by:
Shanyu Zhao <shanyu.zhao@intel.com> Signed-off-by:
Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by:
Reinette Chatre <reinette.chatre@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Matt Helsley authored
commit 5a7aadfe upstream. When the cgroup freezer is used to freeze tasks we do not want to thaw those tasks during resume. Currently we test the cgroup freezer state of the resuming tasks to see if the cgroup is FROZEN. If so then we don't thaw the task. However, the FREEZING state also indicates that the task should remain frozen. This also avoids a problem pointed out by Oren Ladaan: the freezer state transition from FREEZING to FROZEN is updated lazily when userspace reads or writes the freezer.state file in the cgroup filesystem. This means that resume will thaw tasks in cgroups which should be in the FROZEN state if there is no read/write of the freezer.state file to trigger this transition before suspend. NOTE: Another "simple" solution would be to always update the cgroup freezer state during resume. However it's a bad choice for several reasons: Updating the cgroup freezer state is somewhat expensive because it requires walking all the tasks in the cgroup and checking if they are each frozen. Worse, this could easily make resume run in N^2 time where N is the number of tasks in the cgroup. Finally, updating the freezer state from this code path requires trickier locking because of the way locks must be ordered. Instead of updating the freezer state we rely on the fact that lazy updates only manage the transition from FREEZING to FROZEN. We know that a cgroup with the FREEZING state may actually be FROZEN so test for that state too. This makes sense in the resume path even for partially-frozen cgroups -- those that really are FREEZING but not FROZEN. Reported-by:
Oren Ladaan <orenl@cs.columbia.edu> Signed-off-by:
Matt Helsley <matthltc@us.ibm.com> Signed-off-by:
Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
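The resume-path test described in this message reduces to a two-state check. The sketch below is illustrative only: the enum and `skip_thaw_on_resume()` are hypothetical names, not the kernel's cgroup freezer API. It treats FREEZING the same as FROZEN, since the lazy update may not yet have recorded the transition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three cgroup freezer states. */
enum cg_freezer_state_sketch { CG_THAWED, CG_FREEZING, CG_FROZEN };

/*
 * Decide whether resume should leave a task frozen. A cgroup recorded as
 * FREEZING may already be FROZEN in practice, because that transition is
 * only updated lazily, so both states mean "do not thaw".
 */
bool skip_thaw_on_resume(enum cg_freezer_state_sketch state)
{
    return state == CG_FREEZING || state == CG_FROZEN;
}

/* Usage: only tasks in a THAWED cgroup are thawed on resume. */
void demo_skip_thaw(void)
{
    assert(!skip_thaw_on_resume(CG_THAWED));
    assert(skip_thaw_on_resume(CG_FREEZING));
    assert(skip_thaw_on_resume(CG_FROZEN));
}
```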
-
Mike Christie authored
commit 4ae0a6c1 upstream. We could be failing/stopping a connection due to libiscsi starting recovery/cleanup, but the xmit path or scsi eh thread path could be dropping the connection at the same time. As a result the session->state gets set to failed instead of in recovery. We end up not blocking the session and so the replacement timeout never gets started and we only end up failing the IO when scsi_softirq_done sees that the cmd has been running for (cmd->allowed + 1) * rq->timeout secs. We used to fail the IO right away so users are seeing a long delay when using dm-multipath. This problem was added in 2.6.28. Signed-off-by:
Mike Christie <michaelc@cs.wisc.edu> Signed-off-by:
James Bottomley <James.Bottomley@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Andrew Stubbs authored
commit d5ab7803 upstream. Ensure that the aux table is properly initialized, even when optional features are missing. Without this, the FDPIC loader did not work. Signed-off-by:
Andrew Stubbs <ams@codesourcery.com> Signed-off-by:
Paul Mundt <lethal@linux-sh.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Matt Fleming authored
commit 4bea3418 upstream. For the boot, enable_mmu() is called from setup_arch() but we don't call setup_arch() for any of the other cpus. So turn on the non-boot cpu's mmu inside of start_secondary(). I noticed this bug on an SMP board when trying to map I/O memory (smsc911x registers) into the kernel address space. Since the Address Translation bit in MMUCR wasn't set, accessing the virtual address where the smsc911x registers were supposedly mapped actually performed a physical address access. Signed-off-by:
Matt Fleming <matt@console-pimps.org> Signed-off-by:
Paul Mundt <lethal@linux-sh.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christoph Hellwig authored
commit f1f724e4 upstream The radix-tree code requires its users to serialize tag updates against other updates to the tree. While XFS protects tag updates against each other it does not serialize them against updates of the tree contents, which can lead to tag corruption. Fix the inode cache to always take pag_ici_lock in exclusive mode when updating radix tree tags. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reported-by:
Patrick Schreurs <patrick@news-service.com> Tested-by:
Patrick Schreurs <patrick@news-service.com> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit 77d7a0c2 upstream The introduction of barriers to loop devices has created a new IO order completion dependency that XFS does not handle. The loop device implements barriers using fsync and so turns a log IO in the XFS filesystem on the loop device into a data IO in the backing filesystem. That is, the completion of log IOs in the loop filesystem is now dependent on completion of data IO in the backing filesystem. This can cause deadlocks when a flush daemon issues a log force with an inode locked because the IO completion of IO on the inode is blocked by the inode lock. This in turn prevents further data IO completion from occurring on all XFS filesystems on that CPU (due to the shared nature of the completion queues). This then prevents the log IO from completing because the log is waiting for data IO completion as well. The fix for this new completion order dependency issue is to make the IO completion inode locking non-blocking. If the inode lock can't be grabbed, simply requeue the IO completion back to the work queue so that it can be processed later. This prevents the completion queue from being blocked and allows data IO completion on other inodes to proceed, hence avoiding completion order dependent deadlocks. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
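The "try the lock, otherwise requeue" pattern from this message can be sketched as follows. This is a single-threaded model under stated assumptions: `struct inode_sketch`, `struct workqueue_sketch` and `finish_io_sketch()` are hypothetical, and the boolean "lock" is not atomic, so it only illustrates the control flow and is not a usable lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for an inode and a completion work queue. */
struct inode_sketch { bool locked; int completions; };
struct workqueue_sketch { int requeued; };

/* Non-blocking lock attempt: fail immediately instead of waiting. */
bool inode_trylock(struct inode_sketch *ip)
{
    if (ip->locked)
        return false;
    ip->locked = true;
    return true;
}

/* Returns true when the completion ran, false when it was requeued. */
bool finish_io_sketch(struct inode_sketch *ip, struct workqueue_sketch *wq)
{
    if (!inode_trylock(ip)) {
        wq->requeued++;     /* put the work back; it is retried later */
        return false;
    }
    ip->completions++;      /* completion work done under the lock */
    ip->locked = false;
    return true;
}

/* Usage: a held lock delays one inode without stalling the others. */
void demo_finish_io(void)
{
    struct inode_sketch busy = { .locked = true, .completions = 0 };
    struct inode_sketch idle = { .locked = false, .completions = 0 };
    struct workqueue_sketch wq = { .requeued = 0 };

    assert(!finish_io_sketch(&busy, &wq));  /* lock held elsewhere: requeue */
    assert(finish_io_sketch(&idle, &wq));   /* other inodes still progress */
    assert(wq.requeued == 1 && busy.completions == 0 && idle.completions == 1);
}
```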
-
Christoph Hellwig authored
commit e8b217e7 upstream Date: Tue, 2 Feb 2010 10:16:26 +1100 We always need to flush the disk write cache and can't skip it just because no inode attributes have changed. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit cbe132a8 upstream If we hold onto reserved blocks when doing a remount,ro we end up writing the blocks used count to disk that includes the reserved blocks. Reserved blocks are not actually used, so this results in the values in the superblock being incorrect. Hence if we run xfs_check or xfs_repair -n while the filesystem is mounted remount,ro we end up with an inconsistent filesystem being reported. Also, running xfs_copy on the remount,ro filesystem will result in an inconsistent image being generated. To fix this, unreserve the blocks when doing the remount,ro, and reserve them again on remount,rw. This way a remount,ro filesystem will appear consistent on disk to all utilities. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christoph Hellwig authored
commit 9b00f307 upstream A "df" run on an NFS client of an exported XFS file system reports the wrong information for "available" blocks. When a block quota is enforced, the amount reported as free is limited by the quota, but the amount reported available is not (and should be). Reported-by:
Guk-Bong, Kwon <gbkwon@gmail.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit e09f9860 upstream When swapping extents, we can corrupt inodes by swapping data forks that are in incompatible formats. This is caused by the two inodes having different fork offsets due to the presence of an attribute fork on an attr2 filesystem. xfs_fsr tries to be smart about setting the fork offset, but the trick it plays only works on attr1 (old fixed format attribute fork) filesystems. Changing the way xfs_fsr sets up the attribute fork will prevent this situation from ever occurring, so in the kernel code we can get by with a preventative fix - check that the data fork in the defragmented inode is in a format valid for the inode it is being swapped into. This will lead to files that will silently and potentially repeatedly fail defragmentation, so issue a warning to the log when this particular failure occurs to let us know that xfs_fsr needs updating/fixing. To help identify how to improve xfs_fsr to avoid this issue, add trace points for the inodes being swapped so that we can determine why the swap was rejected and to confirm that the code is making the right decisions and modifications when swapping forks. A further complication is that even when the swap is allowed to proceed, if the fork offset differs between the two inodes then the value for the maximum number of extents the data fork can hold can be wrong. Make sure these are also set correctly after the swap occurs. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit 4b6a4688 upstream When reclaiming stale inodes, we need to guarantee that inodes are unpinned before returning with a "clean" status. If we don't we can reclaim inodes that are pinned, leading to use after free in the transaction subsystem as transactions complete. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit 57817c68 upstream We cannot do direct inode reclaim without taking the flush lock to ensure that we do not reclaim an inode under IO. We check the inode is clean before doing direct reclaim, but this is not good enough because the inode flush code marks the inode clean once it has copied the in-core dirty state to the backing buffer. It is the flush lock that determines whether the inode is still under IO, even though it is marked clean, and the inode is still required at IO completion so we can't reclaim it even though it is clean in core. Hence the requirement that we need to take the flush lock even on clean inodes because this guarantees that the inode writeback IO has completed and it is safe to reclaim the inode. With delayed write inode flushing, we could end up waiting a long time on the flush lock even for a clean inode. The background reclaim already handles this efficiently, so avoid all the problems by killing the direct reclaim path altogether. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit 018027be upstream The reclaim code will handle flushing of dirty inodes before reclaim occurs, so avoid them when determining whether an inode is a candidate for flushing to disk when walking the radix trees. This is based on a test patch from Christoph Hellwig. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit c8e20be0 upstream Make the inode tree reclaim walk exclusive to avoid races with concurrent sync walkers and lookups. This is a version of a patch posted by Christoph Hellwig that avoids all the code duplication. Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dave Chinner authored
commit fd45e478 upstream When we search for and find a busy extent during allocation we force the log out to ensure the extent free transaction is on disk before the allocation transaction. The current implementation has a subtle bug in it--it does not handle multiple overlapping ranges. That is, if we free lots of little extents into a single contiguous extent, then allocate the contiguous extent, the busy search code stops searching at the first extent it finds that overlaps the allocated range. It then uses the commit LSN of the transaction to force the log out to. Unfortunately, the other busy ranges might have more recent commit LSNs than the first busy extent that is found, and this results in xfs_alloc_search_busy() returning before all the extent free transactions are on disk for the range being allocated. This can lead to potential metadata corruption or stale data exposure after a crash because log replay won't replay all the extent free transactions that cover the allocation range. Modified-by:
Alex Elder <aelder@sgi.com> (Dropped the "found" argument from the xfs_alloc_busysearch trace event.) Signed-off-by:
Dave Chinner <david@fromorbit.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
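The difference between stopping at the first overlap and scanning every overlapping busy range can be sketched like this. It is a simplified model, not the XFS implementation: `struct busy_extent_sketch` and `busy_search_max_lsn()` are hypothetical, the busy list is a flat array, and returning 0 for "no overlap" is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical busy-extent record: a block range plus its commit LSN. */
struct busy_extent_sketch {
    unsigned long start;
    unsigned long len;
    unsigned long long commit_lsn;
};

/*
 * Return the highest commit LSN among ALL busy extents overlapping
 * [start, start + len), or 0 if none overlap. Stopping at the first
 * match could return an older LSN than the log must be forced to.
 */
unsigned long long busy_search_max_lsn(const struct busy_extent_sketch *busy,
                                       size_t n, unsigned long start,
                                       unsigned long len)
{
    unsigned long long max_lsn = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long busy_end = busy[i].start + busy[i].len;
        int overlaps = busy[i].start < start + len && start < busy_end;

        if (overlaps && busy[i].commit_lsn > max_lsn)
            max_lsn = busy[i].commit_lsn;
    }
    return max_lsn;
}

/* Usage: three small freed extents covered by one 12-block allocation. */
void demo_busy_search(void)
{
    const struct busy_extent_sketch busy[] = {
        { 0, 4, 10 }, { 4, 4, 30 }, { 8, 4, 20 }, { 100, 4, 99 },
    };

    assert(busy_search_max_lsn(busy, 4, 0, 12) == 30);  /* not 10, the first hit */
    assert(busy_search_max_lsn(busy, 4, 50, 10) == 0);  /* nothing overlaps */
}
```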
-
Dave Chinner authored
commit 44e08c45 upstream Because inodes remain in cache much longer than inode buffers do under memory pressure, we can get the situation where we have stale, dirty inodes being reclaimed but the backing storage has been freed. Hence we should never, ever flush XFS_ISTALE inodes to disk as there is no guarantee that the backing buffer is in cache and still marked stale when the flush occurs. Signed-off-by:
Dave Chinner <david@fromorbit.com> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christoph Hellwig authored
commit d6d59bad upstream We currently have some rather odd code in xfs_setattr for updating the a/c/mtime timestamps:
  - first we do a non-transaction update if all three are updated together
  - second we implicitly update the ctime for various changes instead of relying on the ATTR_CTIME flag
  - third we set the timestamps to the current time instead of the arguments in the iattr structure in many cases.
This patch makes sure we update it in a consistent way:
  - always transactional
  - ctime is only updated if ATTR_CTIME is set or we do a size update, which is a special case
  - always to the times passed in from the caller instead of the current time
The only non-size caller of xfs_setattr that doesn't come from the VFS is updated to set ATTR_CTIME and pass in a valid ctime value. Reported-by:
Eric Blake <ebb9@byu.net> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christoph Hellwig authored
commit b44b1126 upstream Add an assert for inodes not added to the inode cache in xfs_ireclaim, to make sure we're not going to introduce something like the famous nfsd inode cache bug again. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jason Gunthorpe authored
commit 44a743f6 upstream Noticed that through glibc fallocate would return 28 rather than -1 and errno = 28 for ENOSPC. The xfs routines use XFS_ERROR format positive return error codes while the syscalls use negative return codes. Fixup the two cases in xfs_vn_fallocate syscall to convert to negative. Signed-off-by:
Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by:
Eric Sandeen <sandeen@sandeen.net> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
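The sign convention mismatch can be shown with a small sketch. The function names below are hypothetical and only model the two conventions named in the message: an internal routine returning a positive error code, and a syscall-facing wrapper that must negate it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical internal routine using positive error codes. */
int xfs_style_alloc_sketch(bool out_of_space)
{
    return out_of_space ? ENOSPC : 0;
}

/* Hypothetical syscall-facing wrapper: convert to a negative return code. */
int vfs_fallocate_sketch(bool out_of_space)
{
    int error = xfs_style_alloc_sketch(out_of_space);

    return -error;  /* 0 stays 0; ENOSPC becomes -ENOSPC */
}

/* Usage: success is unchanged, failure is reported as a negative errno. */
void demo_fallocate_sign(void)
{
    assert(vfs_fallocate_sketch(false) == 0);
    assert(vfs_fallocate_sketch(true) == -ENOSPC);
}
```

Without the negation the positive value would reach the caller as an ordinary return value, which matches the symptom of fallocate returning 28 instead of -1 with errno set.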
-
Andy Poling authored
commit fc5bc4c8 upstream Summary of problem: If a journal record wraps at the physical end of the journal, it has to be read in two parts in xlog_do_recovery_pass(): a read at the physical end and a read at the physical beginning. If xlog_bread() has to re-align the first read, the second read request does not take that re-alignment into account. If the first read was re-aligned, the second read over-writes the end of the data from the first read, effectively corrupting it. This can happen either when reading the record header or reading the record data. The first sanity check in xlog_recover_process_data() is to check for a valid clientid, so that is the error reported. Summary of fix: If there was a first read at the physical end, XFS_BUF_PTR() returns where the data was requested to begin. Conversely, because it is the result of xlog_align(), offset indicates where the requested data for the first read actually begins - whether or not xlog_bread() has re-aligned it. Using offset as the base for the calculation of where to place the second read data ensures that it will be correctly placed immediately following the data from the first read instead of sometimes over-writing the end of it. The attached patch has resolved the reported problem of occasional inability to recover the journal (reporting "bad clientid"). Signed-off-by:
Andy Poling <andy@realbig.com> Reviewed-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
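The placement rule in the summary of the fix can be sketched with an ordinary byte buffer. This is a simplified model with hypothetical names: `read_wrapped_sketch()` is not an XFS function, `shift` stands in for whatever re-alignment the first read applied, and `offset` plays the role of the pointer to where the first read's data actually begins. The second part is placed relative to `offset`, not relative to the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Read `len` bytes of a circular log starting at `start`, wrapping at the
 * physical end. The first part lands `shift` bytes into `buf`; the caller
 * must provide at least shift + len bytes. Returns where the data begins.
 */
char *read_wrapped_sketch(const char *log, size_t log_size, size_t start,
                          size_t len, char *buf, size_t shift)
{
    assert(start < log_size && len <= log_size);

    size_t first = log_size - start;    /* bytes before the physical end */
    if (first > len)
        first = len;

    char *offset = buf + shift;         /* actual start of the first read */
    memcpy(offset, log + start, first);

    /* Base the second read on offset so it directly follows the first. */
    memcpy(offset + first, log, len - first);
    return offset;
}

/* Usage: a 4-byte record starting 2 bytes before the end of an 8-byte log. */
void demo_read_wrapped(void)
{
    char buf[16] = { 0 };
    char *data = read_wrapped_sketch("ABCDEFGH", 8, 6, 4, buf, 3);

    assert(memcmp(data, "GHAB", 4) == 0);
}
```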
-
Christoph Hellwig authored
commit 80641dc6 upstream When completing I/O requests we must not allow the memory allocator to recurse into the filesystem, as we might deadlock on waiting for the I/O completion otherwise. The only thing currently allocating normal GFP_KERNEL memory is the allocation of the transaction structure for the unwritten extent conversion. Add a memflags argument to _xfs_trans_alloc to allow controlling the allocator behaviour. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reported-by:
Thomas Neumann <tneumann@users.sourceforge.net> Tested-by:
Thomas Neumann <tneumann@users.sourceforge.net> Reviewed-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christoph Hellwig authored
commit c56c9631 upstream When xfs_free_eofblocks is called from ->release the VM might already hold the mmap_sem, but in the write path we take the iolock before taking the mmap_sem in the generic write code. Switch xfs_free_eofblocks to only trylock the iolock if called from ->release and skip trimming the preallocated blocks in that case. We'll still free them later on the final iput. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Alex Elder <aelder@sgi.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-