- 28 Jun, 2006 3 commits
-
-
Eric Moore authored
MPI Header Update Signed-off-by: Eric Moore <Eric.Moore@lsil.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Brian King authored
If a device gets offlined as a result of the Inquiry sent during scanning, the following oops can occur. After the disk gets put into the SDEV_OFFLINE state, the error handler sends back the failed inquiry, which wakes the thread doing the scan. This starts a race between the scanning thread freeing the scsi device and the error handler calling scsi_run_host_queues to restart the host. Since the disk is in the SDEV_OFFLINE state, scsi_device_get will still work, which results in __scsi_iterate_devices getting a reference to the scsi disk when it shouldn't.

The following execution thread causes the oops:

CPU 0 (scan)                         CPU 1 (eh)
---------------------------------------------------------
scsi_probe_and_add_lun
 ....
                                     scsi_eh_offline_sdevs
                                     scsi_eh_flush_done_q
scsi_destroy_sdev
scsi_device_dev_release
                                     scsi_restart_operations
                                     scsi_run_host_queues
                                     __scsi_iterate_devices
                                     get_device
scsi_device_dev_release_usercontext
                                     scsi_run_queue
                                     <---OOPS--->

The patch fixes this by changing the state of the sdev to SDEV_DEL before doing the final put_device, which should prevent the race from occurring.
Original oops follows:

Badness in kref_get at lib/kref.c:32
Call Trace:
[C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
[C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
[C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
Exception: 700 at .kref_get+0x10/0x28
LR = .kobject_get+0x20/0x3c
[C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
[C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
[C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
[C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
[C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
[C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
[C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
data at address 0x000001b8
Faulting instruction address: 0xd0000000000698e4
sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi2 : sym-2.2.2
cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
sp: c00000002f447cb0
msr: 9000000000009032
dar: 1b8
dsisr: 40000000
current = 0xc0000000045fecd0
paca = 0xc00000000048ee80
pid = 1123, comm = scsi_eh_1
enter ? for help
[c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
[c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
[c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
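The ordering this fix relies on can be illustrated with a small userspace model. This is only a sketch: the types and functions below are hypothetical stand-ins, not the kernel's, and it assumes (as the description implies) that a reference lookup refuses a device already in the deleted state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's device states; not the real enum. */
enum sdev_state_model { MODEL_SDEV_RUNNING, MODEL_SDEV_OFFLINE, MODEL_SDEV_DEL };

struct sdev_model {
    enum sdev_state_model state;
    int refcount;
};

/* Models the lookup: a device marked deleted hands out no new references,
 * while an offline device still does (the condition behind the race). */
static int sdev_model_get(struct sdev_model *sdev)
{
    assert(sdev != NULL);
    if (sdev->state == MODEL_SDEV_DEL)
        return -ENXIO;
    sdev->refcount++;
    return 0;
}

/* Models the fixed teardown order: mark deleted first, then drop the
 * final reference, so a concurrent lookup can no longer succeed. */
static void sdev_model_destroy(struct sdev_model *sdev)
{
    assert(sdev != NULL);
    sdev->state = MODEL_SDEV_DEL;
    sdev->refcount--;
}
```

In this model, calling sdev_model_get() on an offline device succeeds, while the same call after sdev_model_destroy() returns -ENXIO and leaves the reference count untouched.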
-
Brian King authored
This is a resend of a patch I generated in response to an email sent by Ruben Faelens <parasietje@gmail.com>. His original email to linux-scsi requested a method in which he could spin down a scsi disk when not in use and have the kernel automatically spin it back up when an I/O was generated to the disk. The infrastructure to automatically spin a disk up has been in the scsi error handler for some time now, but it is not enabled by default. This patch adds an sd sysfs attribute which allows userspace to enable this behavior. Signed-off-by: Brian King <brking@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
- 27 Jun, 2006 5 commits
-
-
James Smart authored
Original post was incorrect as it didn't realize that we already had a self-reference due to device_initialize(), and we were really only missing the put on our own reference. This was hidden by the other bug which had the midlayer reusing stargets after they were already free, which was doing too many puts on our rport. Updating FC transport for:
  - Add put in fc_rport_final_delete(), to release the rport. Prior, we were leaving the rport with a reference, thus the shost with references, etc. If the driver was unloaded, shosts and rports remained, along with work threads, etc.
  - Fix fc_rport_create failure path: too many puts on parent.
  - Add commenting to easily track ref taking.
Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
James Smart authored
Updated patch to address comments from Pat Mansfield and Michael Reed: Bumped max to 600 (10mins). Set default dev_loss_tmo to a value other than the max (30s). Signed-off-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
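A minimal sketch of the range check described above, using the two values from the message (maximum 600 seconds, default 30 seconds). The macro and function names are hypothetical, and treating 0 as an acceptable lower bound is an assumption, not something the message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values from the commit message; the names are illustrative only. */
#define MODEL_DEV_LOSS_TMO_MAX     600  /* seconds (10 minutes) */
#define MODEL_DEV_LOSS_TMO_DEFAULT 30   /* seconds */

/* Stores the requested timeout if it is within range, otherwise leaves
 * the current value alone and reports -EINVAL. */
static int model_set_dev_loss_tmo(int requested, int *tmo)
{
    assert(tmo != NULL);
    if (requested < 0 || requested > MODEL_DEV_LOSS_TMO_MAX)
        return -EINVAL;
    *tmo = requested;
    return 0;
}
```

Starting from MODEL_DEV_LOSS_TMO_DEFAULT, a request for 600 is accepted and a request for 601 is rejected without changing the stored value.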
-
James Smart authored
In a prior posting to linux-scsi on the fc transport and workq deadlocks, we noted a second error that did not have a patch: http://marc.theaimsgroup.com/?l=linux-scsi&m=114467847711383&w=2
  - There's a deadlock where scsi_remove_target() has to sit behind scsi_scan_target() due to contention over the scan_lock().

Subsequently we posted a request for comments about the deadlock: http://marc.theaimsgroup.com/?l=linux-scsi&m=114469358829500&w=2

This posting resolves the second error. Here's what we now understand, and are implementing:

If the lldd deletes the rport while a scan is active, the sdev's queue is blocked, which stops the issuing of commands associated with the scan. At this point, the scan stalls, and does so with the shost->scan_mutex held. If, at this point, any scan or delete request is made on the host, it will stall waiting for the scan_mutex.

For the FC transport, we queue all delete work to a single workq. So, things worked fine when competing with the scan, as long as the target blocking the scan was the same target at the top of our delete workq, as the delete workq routine always unblocked just prior to requesting the delete. Unfortunately, if the top of our delete workq was for a different target, we deadlock. Additionally, if the target blocking scan returned, we were unblocking it in the scan workq routine, which really won't execute until the existing stalled scan workq completes (e.g. we're re-scheduling it while it is in the midst of its execution).

This patch moves the unblock out of the workq routines and moves it to the context that is scheduling the work. This ensures that at some point, we will unblock the target that is blocking scan. Please note, however, that the deadlock condition may still occur while it waits for the transport to time out an unblock on a target. Worst case, this is bounded by the transport dev_loss_tmo (default: 30 seconds).

Finally, Michael Reed deserves the credit for the bulk of this patch, analysis, and its testing. Thank you for your help. Note: The request for comments statements about the gross-ness of the scan_mutex still stand.

Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
James Smart authored
This removes the duplicate functionality which had been added to the lpfc driver. Signed-off-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
James Smart authored
The scsi midlayer portion of the patch Signed-off-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
- 26 Jun, 2006 32 commits
-
-
Salyzyn, Mark authored
This may seem like a DILLIGAF, but after chatting with the F/W folks, there is no harm in dropping the page calculation as denoted in the enclosed patch for these older adapters in this new age of 4GB+ memory sticks. Any resource optimization within the old-old-old adapters for systems with less than 4G of memory is of little consequence. The existing AAC_QUIRK_31BIT flag in linit.c should look after the rest of the legacy hardware DMA limitations. Signed-off-by: Mark Salyzyn <aacraid@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Matt Mackall authored
The scsi layer is already calling add_disk_randomness in scsi_end_request. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Alan Cox authored
We currently stuff a truncated size into the geometry logic and return the result, which can produce bizarre reports for a 4TB array. Since that mapping logic isn't useful for disks that big, don't try to map this way at all. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
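To illustrate why a truncated size gives nonsense for an array this large, here is a small sketch. It assumes the capacity is counted in 512-byte sectors and that the truncation is to 32 bits; neither detail is stated in the message, and the function names are hypothetical.

```c
#include <assert.h>  /* used by the checks mentioned in the note below */
#include <stdint.h>

/* Models the bug: a 64-bit sector count squeezed into 32 bits. */
static uint32_t truncate_sectors(uint64_t sectors)
{
    return (uint32_t)sectors;
}

/* Models the fix: only attempt the geometry mapping when the sector
 * count survives the narrowing unchanged. */
static int capacity_fits_geometry(uint64_t sectors)
{
    return sectors <= UINT32_MAX;
}
```

Under these assumptions a 4 TiB array has 2^33 sectors, so `truncate_sectors(1ULL << 33)` yields 0, and `capacity_fits_geometry(1ULL << 33)` is false, which is the case where the mapping would be skipped.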
-
GOTO Masanori authored
Add scsi_add_host() failure handling for nsp32 and silence warning. drivers/scsi/nsp32.c:2888: warning: ignoring return value of 'scsi_add_host', declared with attribute warn_unused_result Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: GOTO Masanori <gotom@sanori.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Randy Dunlap authored
From: Randy Dunlap <rdunlap@xenotime.net> Fix sparse warnings: use NULL instead of 0 for pointers: drivers/scsi/lpfc/lpfc_els.c:827:56: warning: Using plain integer as NULL pointer drivers/scsi/lpfc/lpfc_els.c:2781:18: warning: Using plain integer as NULL pointer drivers/scsi/lpfc/lpfc_els.c:2782:18: warning: Using plain integer as NULL pointer drivers/scsi/lpfc/lpfc_init.c:951:21: warning: Using plain integer as NULL pointer drivers/scsi/lpfc/lpfc_init.c:956:20: warning: Using plain integer as NULL pointer Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Acked-by: James Smart <James.Smart@Emulex.Com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Original code incorrectly assigned it to the driver's link-down-timeout value (a value in seconds). Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Also remove qla2xxx_probe_one/qla2xxx_remove_one stubs previously used with external firmware module loaders. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
As there is no point in failing the initialization process when firmware informs the host software that it could not transition beyond a CONFIG_WAIT or WAIT_FOR_LOGIN state. Previous logic would mark such conditions as a general *failure* and subsequently tear down the scsi-host during initialization. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Similar in form to QLogic's standard offering -- via the 'extended_error_logging' module parameter. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
- macro usage statements should terminate with a ';' - remove unused macros. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
The host section of ISP24xx NVRAMs contain a new bit which allows a user to selectively disable ports of an HBA. These ports (hosts) will not be presented to the midlayer. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Andrew Vasquez authored
  - Defer firmware dump-data raw-to-textual conversion to user-space.
  - Add module parameter (ql2xallocfwdump) to allow for per-HBA allocations of firmware dump memory.
  - Dump request and response queue data as per firmware group request.
  - Add extended firmware trace support for ISP24XX/ISP54XX chips.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Alan Stern authored
We have to be able to remove SCSI devices even when they are suspended, so QUIESCE -> CANCEL must be a legal state transition. This patch (as727) adds the transition to the state machine. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
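The idea of a table of legal state transitions can be sketched as below. This is a simplified model, not the kernel's state machine: only the QUIESCE -> CANCEL entry comes from the message, and the state names, the other table entries, and the function name are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, hypothetical set of device states. */
enum model_state { M_RUNNING, M_QUIESCE, M_OFFLINE, M_CANCEL, M_NSTATES };

/* model_allowed[from][to] is true when the transition is legal.
 * The QUIESCE -> CANCEL entry is the one the patch describes adding. */
static const bool model_allowed[M_NSTATES][M_NSTATES] = {
    [M_RUNNING] = { [M_QUIESCE] = true, [M_OFFLINE] = true, [M_CANCEL] = true },
    [M_QUIESCE] = { [M_RUNNING] = true, [M_OFFLINE] = true, [M_CANCEL] = true },
    [M_OFFLINE] = { [M_RUNNING] = true, [M_CANCEL] = true },
    [M_CANCEL]  = { 0 },  /* no way back out of CANCEL in this model */
};

/* Applies the transition if the table permits it; returns -1 otherwise. */
static int model_set_state(enum model_state *cur, enum model_state next)
{
    assert(*cur < M_NSTATES && next < M_NSTATES);
    if (!model_allowed[*cur][next])
        return -1;
    *cur = next;
    return 0;
}
```

With the QUIESCE row including CANCEL, a suspended device in this model can be moved straight to removal instead of having the request rejected.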
-
Luben Tuikov authored
This patch simplifies "good_bytes" computation in sd_rw_intr(). sd: "good_bytes" computation is always done in terms of the resolution of the device's medium, since after that it is the number of good bytes we pass around and other layers/contexts (as opposed to sd) can translate that to their own resolution (block layer:512). It also makes scsi_io_completion() processing more straightforward, eliminating the 3rd argument to the function. It also fixes a couple of bugs like not checking return value, using "break" instead of "return;", etc. I've been running with this patch for some time now on a test (do-it-all) system. Signed-off-by: Luben Tuikov <ltuikov@yahoo.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Hannes Reinecke authored
Even with the latest fixes aic79xx still occasionally triggers the BUG_ON in slave_destroy. Rather than trying to figure out the various levels of interaction here I've decided to remove the callback altogether. The primary reason for the slave_alloc / slave_destroy is to keep an index of pointers to the sdevs associated with a given target. However, by changing the arguments to the affected functions slightly it's possible to avoid the use of that index entirely. The only performance penalty we'll incur is in writing the information for /proc/scsi/XXX, as we'll have to recurse over all available sdevs to find the correct ones. But I doubt that reading from /proc is in any way time-critical. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Hannes Reinecke authored
According to Anthony Cheung all HP XP arrays with "OPEN-" types support REPORT_LUN. So there is no reason why we shouldn't use it. Signed-off-by: Anthony Cheung <anthony.cheung@hp.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Sumant Patro authored
The patch adds support for a ZCR controller (Device ID: 0x413). It also has a critical bug fix: disable the controller interrupt before firing the INIT cmd to FW. The interrupt is enabled after the required initialization is over. This is done to ensure that the driver is ready to handle interrupts when they are generated by the controller. Signed-off-by: Sumant Patro <Sumant.Patro@lsil.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Dave C Boutcher authored
This patch fixes a condition where ibmvscsi treats a transport error as a "busy" condition, so no errors were returned to the scsi mid-layer. In a RAID environment this means that I/O hung rather than failing over. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Douglas Gilbert authored
  - add 'virtual_gb' parameter to simulate large storage (by wrapping in dev_size_mb megabytes of actual ram)
  - add 'no_lun_0' parameter to skip lun 0 on each target (but still respond as required to INQUIRY + REPORT LUNS)
  - add well known lu support
  - add MODE SELECT commands support [pages: 0xa and 0x1c]
  - add LOG SENSE command support [pages: 0xd and 0x2f]
  - add READ CAPACITY (16) support
  - increase number of mode pages supported (to read), mainly transport specific (SAS) mode (sub)pages
  - add more VPD pages and extend others, including ATA information VPD page
  - START STOP UNIT now maintains a state machine
  - READ (16) and WRITE (16) cope with lbas larger than 32 bits (needed for the 'virtual_gb' parameter)
  - allow single command transfers up to 32 MB
  - more precise error (sense data) messages
Signed-off-by: Douglas Gilbert <dougg@torque.net> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
Dave Jones authored
On Wed, Jun 21, 2006 at 07:00:34PM +0000, Linux Kernel wrote:
> commit 67d59dfd
> tree ae85703651d81740f4a6cd398f9dd4d6aabe6a2f
> parent 6db874fb
> author James Bottomley <James.Bottomley@steeleye.com> Wed, 14 Jun 2006 07:31:19 -0500
> committer James Bottomley <jejb@mulgrave.il.steeleye.com> Tue, 20 Jun 2006 05:34:01 -0500
>
> [SCSI] 53c700: remove reliance on deprecated cmnd fields
> ...
>
> + SDp->hostdata = kmalloc(GFP_KERNEL, sizeof(struct NCR_700_sense));
> +
> + if (!SDp->hostdata)
> + return -ENOMEM;

"I'll take reversed arguments for $100 please Alex".

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
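The kernel's kmalloc() takes the size first and the allocation flags second, so the quoted call passes them the wrong way round. The userspace sketch below shows the effect with a hypothetical stand-in allocator; model_kmalloc, MODEL_GFP_KERNEL (an arbitrary value) and struct model_sense are all illustrative, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int model_gfp_t;
#define MODEL_GFP_KERNEL 0xd0u  /* arbitrary illustrative flag value */

/* Records the size actually requested, for inspection. */
static size_t model_last_size;

/* Stand-in with the same parameter order as kmalloc(): size, then flags. */
static void *model_kmalloc(size_t size, model_gfp_t flags)
{
    (void)flags;            /* flags are ignored in this model */
    assert(size > 0);
    model_last_size = size;
    return malloc(size);
}

/* Hypothetical structure standing in for the per-device sense data. */
struct model_sense { unsigned char buf[96]; };
```

Calling `model_kmalloc(sizeof(struct model_sense), MODEL_GFP_KERNEL)` requests 96 bytes here, whereas the reversed form `model_kmalloc(MODEL_GFP_KERNEL, sizeof(struct model_sense))` compiles just as quietly but requests however many bytes the flag value happens to equal.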
-
Malcolm Parsons authored
binfmt_flat.c calls set_personality with PER_LINUX as the personality. On the arm architecture this results in the program running in 26bit usermode. PER_LINUX_32BIT should be used instead. This doesn't affect other architectures that use binfmt_flat. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Greg Ungerer authored
Change enable_irq() macro to be a statement, not expression. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
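One common way to make a macro behave as a statement is the `do { ... } while (0)` wrapper. The sketch below assumes that idiom for illustration; the message does not say which form the patch uses, and model_enable_irq and the mask variable are hypothetical.

```c
#include <assert.h>

/* Hypothetical bitmask of enabled interrupt lines. */
static int model_enabled_mask;

/* Statement-style macro: it must be followed by ';' and cannot be used
 * as a value, but it is safe inside an unbraced if/else. */
#define model_enable_irq(irq) \
    do { model_enabled_mask |= 1 << (irq); } while (0)

/* Uses the macro in an unbraced if/else, where a bare `{ ... }` block
 * followed by ';' would break the else branch. */
static int model_enable_if(int cond, int irq)
{
    assert(irq >= 0 && irq < 31);
    if (cond)
        model_enable_irq(irq);
    else
        return 0;
    return 1;
}
```

Because the macro expands to a single statement, code such as `x = model_enable_irq(3);` fails to compile, which surfaces any caller that was relying on it as an expression.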
-
Greg Ungerer authored
Fix PLL setting for the Coldfire 5249 CPU. This brings it into line with the new style frequency configuration of m68knommu parts. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Greg Ungerer authored
Fix flush code for the ColdFire 5206/5206e/5272 cases. Add support for the new ColdFire 532x CPU family. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Philippe De Muyter authored
Here is a patch to the system call handling for 5307/5272/etc to:
  - fix the strace support (one tested the wrong bit)
  - make all system calls a little bit faster by inlining set_esp0 and supporting ENOSYS out of the critical path.
  - remove extraneous spaces
Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Greg Ungerer authored
This patch solves a bug triggered by execvp (this function uses calloc to store the argument list, and gcc 3.4.x aligns the stack to word, not to dword). This situation isn't related to signal handling, and all 2.6.x kernels have the bug. On ColdFire targets we must force the stack to be aligned. Original patch from Andrea Tarani <andrea.tarani@gilbarco.com>. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-