- 05 Feb, 2007 17 commits
-
-
David Teigland authored
Some common, non-error messages should use log_debug instead of log_error so they can be turned off. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
S. Wendy Cheng authored
Second round of gfs2_rename lock re-ordering to allow Anaconda adding root partition on top of gfs2. Previous to this patch the recursive lock detector in glock.c can be triggered due to attempting to lock the rgrp twice. This fixes it by checking to see whether the rgrp is already locked. This fixes Red Hat bugzilla #221237 Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Russell Cattelan authored
Update the quilt header comments to match the code changes. Change gfs2_lookup_simple to return an error in the case of a NULL inode. The callers of gfs2_lookup_simple do not check for NULL in the no entry case and such would end up dereferencing a NULL ptr. This fixes: http://projects.info-pull.com/mokb/MOKB-15-11-2006.htmlSigned-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
In case of unlinked files with dirty pages GFS2 wasn't clearing the pages in quite the right order. This patch clears the pages earlier (before the qlock_dq) to avoid the situation that the release of the glock results in attempting to write back data that has already been deallocated. This fixes Red Hat bugzilla: #220117 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Patrick Caulfield authored
I just noticed this message when testing some other changes I'd made to lowcomms (to use workqueues) but the problem seems to be in the current git trees too. I'm amazed no-one has seen it. BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868 Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Patrick Caulfield authored
I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton. These four should really be calls to schedule() or the dlm can busy-wait. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
S. Wendy Cheng authored
Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow process to acquire these locks recursively by the same process (RHEL installer) that results a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This is partially derrived from a patch written by Russell Cattelan. It fixes a bug where there is a race between readpages and truncate by ignoring readpages for stuffed files. This is ok because a stuffed file will never be more than one block (minus sizeof(struct gfs2_dinode)) in size and block size is always less than page size, so we do not lose anything efficiency-wise by not doing readahead for stuffed files. They will have already been "read ahead" by the action of reading the inode in, in the first place. This is the remaining part of the fix for Red Hat bugzilla #218966 which had not yet made it upstream. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com>
-
Steven Whitehouse authored
This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs due to trying to take the i_mutex while holding a glock. The correct locking order is defined as i_mutex -> glock in all cases. I've left dealing with allocating writes. I know that we need to do that, but for now this should do the trick. We don't need to take the i_mutex on write, because the VFS has already taken it for us. On read we don't need it since the glock is enough protection. The reason that I've made some of the checks into a separate function is that we'll need to do the checks again in the allocating write case eventually, so this is partly in preparation for this. Likewise the return value test of != 1 might look a bit odd and thats because we'll need a third return value in case of requiring an allocation. I've made the change to deferred mode on the glock to ensure flushing read caches on other nodes. I notice that (using blktrace to look at whats going on) we appear to do a better job of large I/Os than ext3 after this patch (in terms of not splitting up the I/Os). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>
-
Adrian Bunk authored
Remove the following unused functions: - lowcomms_send_message() - lowcomms_max_buffer_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
When the dlm fakes an unlock/cancel reply from a failed node using a stub message struct, it wasn't setting the flags in the stub message. So, in the process of receiving the fake message the lkb flags would be updated and cleared from the zero flags in the message. The problem observed in tests was the loss of the USER flag which caused the dlm to think a user lock was a kernel lock and subsequently fail an assertion checking the validity of the ast/callback field. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
LVB's are not sent as part of new requests, but the code receiving the request was copying data into the lvb anyway. The space in the message where it mistakenly thought the lvb lived actually contained the resource name, so it wound up incorrectly copying this name data into the lvb. Fix is to just create the lvb, not copy junk into it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
The send_args() function is used to copy parameters into a message for a number different message types. Only some of those types are set up beforehand (in create_message) to include space for sending lvb data. send_args was wrongly copying the lvb for all message types as long as the lock had an lvb. This means that the lvb data was being written past the end of the message into unknown space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
Check if we receive a message from another lockspace member running a version of the dlm with an incompatible inter-node message protocol. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
A reply to a recovery message will often be received after the relevant recovery sequence has aborted and the next recovery sequence has begun. We need to ignore replies to these old messages from the previous recovery. There's already a way to do this for synchronous recovery requests using the rc_id number, but not for async. Each recovery sequence already has a locally unique sequence number associated with it. This patch adds a field to the rcom (recovery message) structure where this recovery sequence number can be placed, rc_seq. When a node sends a reply to a recovery request, it copies the rc_seq number it received into rc_seq_reply. When the first node receives the reply to its recovery message, it will check whether rc_seq_reply matches the current recovery sequence number, ls_recover_seq, and if not then it ignores the old reply. An old, inadequate approach to filtering out old replies (checking if the current stage of recovery has moved back to the start) has been removed from two spots. The protocol version number is changed to reflect the different rcom structures. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
There's a chance the new master of resource hasn't learned it's the new master before another node sends it a lock during recovery. The node sending the lock needs to resend if this happens. - A sends a master lookup for resource R to C - B sends a master lookup for resource R to C - C receives A's lookup, assigns A to be master of R and sends a reply back to A - C receives B's lookup and sends a reply back to B saying that A is the master - B receives lookup reply from C and sends its lock for R to A - A receives lock from B, doesn't think it's the master of R and sends an error back to B - A receives lookup reply from C and becomes master of R - B gets error back from A and resends its lock back to A (this resending is what this patch does) - A receives lock from B, it now sees it's the master of R and takes the lock Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
If an fs has already been shut down, a lockfs callback should do nothing. An fs that's been shut down can't acquire locks or do anything with respect to the cluster. Also, remove FIXME comment in withdraw function. The missing bits of the withdraw procedure are now all done by user space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
- 04 Feb, 2007 3 commits
-
-
Linus Torvalds authored
-
Frédéric Riss authored
When calling into the EFI firmware, the parameters need to be passed on the stack. The recent change to use -mregparm=3 breaks x86 EFI support. This patch is needed to allow the new Intel-based Macs to suspend to ram (efi.get_time is called during the suspend phase). Signed-off-by: Frederic Riss <frederic.riss@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Al Viro authored
That code doesn't do what its author apparently thought it would do... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Feb, 2007 11 commits
-
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash [SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect [SCSI] qla4xxx: bug fixes [SCSI] Fix scsi_add_device() for async scanning
-
Jeff Garzik authored
x86-64 is missing these: Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
John Keller authored
The SN Altix platform does not conform to the IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi() to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and return. Due to an oversight, this code was not added previously when similar code was added to acpi_register_gsi(). http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2Signed-off-by: John Keller <jpk@sgi.com> Acked-by: Len Brown <lenb@kernel.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
Andrew Vasquez is reporting as-iosched oopses and a 65% throughput slowdown due to the recent special-casing of direct-io against blockdevs. We don't know why either of these things are occurring. The patch minimally reverts us back to the 2.6.19 code for a 2.6.20 release. Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mike Frysinger authored
We went and named them __NR_sys_foo instead of __NR_foo. It may be too late to change this, but we can at least add the proper names now. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Korsgaard authored
smc911x_phy_configure's error handling unconditionally unlocks the spinlock even if it wasn't locked. Patch fixes it. Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Cc: Jeff Garzik <jeff@garzik.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Magnus Damm authored
This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this by skipping migration of already disabled irqs. This is most likely a problem on other ia64 platforms as well, but I've only been able to reproduce it on one machine so far. The full story is that handle_bad_irq() gets invoked before starting the new kernel without this patch. This seems to happen when fixup_irqs() calls generic_handle_irq() on already migrated (and disabled) irqs. So by avoiding migration of disabled irqs we stay away of handle_bad_irq(). The code has been tested on three different ia64 machines, all with good results. It is possible to trigger the same bug by offlining a processor using echo 0 > /sys/devices/system/cpu/cpuX/online. More detailed information is available in the following mail thread: http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Zou, Nanhai <nanhai.zou@intel.com> Acked-by: Jay Lan <jlan@sgi.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ken Chen authored
An AIO bug was reported that sleeping function is being called in softirq context: BUG: warning at kernel/mutex.c:132/__mutex_lock_common() Call Trace: [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0 [<a000000100577ba0>] mutex_lock+0x20/0x40 [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0 [<a00000010018c0c0>] __put_ioctx+0xc0/0x240 [<a00000010018d470>] aio_complete+0x2f0/0x420 [<a00000010019cc80>] finished_one_bio+0x200/0x2a0 [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200 [<a00000010019d260>] dio_bio_end_aio+0x60/0x80 [<a00000010014acd0>] bio_endio+0x110/0x1c0 [<a0000001002770e0>] __end_that_request_first+0x180/0xba0 [<a000000100277b90>] end_that_request_chunk+0x30/0x60 [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod] [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod] [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod] [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod] [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod] [<a000000100277d20>] blk_done_softirq+0x160/0x1c0 [<a000000100083e00>] __do_softirq+0x200/0x240 [<a000000100083eb0>] do_softirq+0x70/0xc0 See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2 flush_workqueue() is not allowed to be called in the softirq context. However, aio_complete() called from I/O interrupt can potentially call put_ioctx with last ref count on ioctx and triggers bug. It is simply incorrect to perform ioctx freeing from aio_complete. The bug is trigger-able from a race between io_destroy() and aio_complete(). A possible scenario: cpu0 cpu1 io_destroy aio_complete wait_for_all_aios { __aio_put_req ... ctx->reqs_active--; if (!ctx->reqs_active) return; } ... put_ioctx(ioctx) put_ioctx(ctx); __put_ioctx bam! Bug trigger! The real problem is that the condition check of ctx->reqs_active in wait_for_all_aios() is incorrect that access to reqs_active is not being properly protected by spin lock. This patch adds that protective spin lock, and at the same time removes all duplicate ref counting for each kiocb as reqs_active is already used as a ref count for each active ioctx. This also ensures that buggy call to flush_workqueue() in softirq context is eliminated. Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Adrian Bunk authored
Fix this by letting NF_CONNTRACK_H323 depend on (IPV6 || IPV6=n). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Patrick McHardy authored
CC net/netfilter/nf_conntrack_netlink.o net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event': net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark' make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nagendra Singh Tomar authored
sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user mode udev program to be called. udev program can read the the allow_restart attribute of the newly created scsi device. This is resulting in a crash as the show function for allow_restart (i.e sd_show_allow_restart) returns the attribute value by reading the sdkp->device->allow_restart variable. As the sdkp->device is not initialized before calling the user mode hotplug helper, this results in a crash. The patch below solves it by calling class_device_add() only after the necessary fields in the scsi_disk structure are initialized properly. Signed-off-by: Nagendra Singh Tomar <nagendra_tomar@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
- 02 Feb, 2007 9 commits
-
-
Linus Torvalds authored
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: Initialize nbytes for internal sg commands libata: Fix ata_busy_wait() kernel docs pata_via: Correct missing comments pata_atiixp: propogate cable detection hack from drivers/ide to the new driver ahci/pata_jmicron: fix JMicron quirk
-
Brian King authored
Some LLDDs, like ipr, use nbytes and pad_len to determine the total data transfer length of a command. Make sure nbytes gets initialized for internally generated commands. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Alan authored
> Looks like you should use ata_busy_wait() here, rather than reproducing > the same code again. It waits in 10uS chunks while 1uS chunks were used in the workaround. Could indeed do that once I know the fix is right. While I'm at it the ata_busy_wait kerneldoc is borked so here's a fix Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Alan authored
The 8237S was added to the chipsets but not to the comments. Fix this Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Alan authored
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Tejun Heo authored
For all JMicrons except for 361 and 368, AHCI mode enable bits in the Control(1) should be set. This used to be done in both ahci and pata_jmicron but while moving programming to PCI quirk, it was removed from ahci part while still left in pata_jmicron. The implemented JMicron PCI quirk was incorrect in that it didn't program AHCI mode enable bits. If pata_jmicron is loaded first and programs those bits, the ahci ports work; otherwise, ahci device detection fails miserably. This patch makes JMicron PCI quirk clear SATA IDE mode bits and set AHCI mode bits and remove the respective part from pata_jmicron. Tested on JMB361, 363 and 368. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Linus Torvalds authored
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: spidernet : fix memory leak in spider_net_stop e100: fix napi ifdefs removing needed code netxen patches
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/bnx2-2.6: [BNX2]: PHY workaround for 5709 A0.
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET_SCHED]: act_ipt: fix regression in ipt action
-