Commit 269f8f70 authored by Linus Torvalds

v2.4.9.13 -> v2.4.9.14

  - Richard Gooch: devfs update
  - Andrea Arcangeli: clean up/fix ramdisk handling now that it's in page cache
  - Al Viro: follow up the above with initrd cleanups
  - Keith Owens: get rid of drivers/scsi/53c700-mem.c file
  - Trond Myklebust: RPC over TCP race fix
  - Greg KH: USB update (ohci understands USB_ZERO_PACKET)
  - me: clean up reference bit handling, fix silly GFP_ATOMIC allocation bug
parent a27c6530
@@ -1675,3 +1675,83 @@ Changes for patch v182
- Fixed number leak for /dev/cdroms/cdrom%d
- Fixed number leak for /dev/discs/disc%d
===============================================================================
Changes for patch v183
- Fixed bug in <devfs_setup> which could hang boot process
===============================================================================
Changes for patch v184
- Documentation typo fix for fs/devfs/util.c
- Fixed drivers/char/stallion.c for devfs
- Added DEVFSD_NOTIFY_DELETE event
- Updated README from master HTML file
- Removed #include <asm/segment.h> from fs/devfs/base.c
===============================================================================
Changes for patch v185
- Made <block_semaphore> and <char_semaphore> in fs/devfs/util.c
private
- Fixed inode table races by removing it and using inode->u.generic_ip
instead
- Moved <devfs_read_inode> into <get_vfs_inode>
- Moved <devfs_write_inode> into <devfs_notify_change>
===============================================================================
Changes for patch v186
- Fixed race in <devfs_do_symlink> for uni-processor
- Updated README from master HTML file
===============================================================================
Changes for patch v187
- Fixed drivers/char/stallion.c for devfs
- Fixed drivers/char/rocket.c for devfs
- Fixed bug in <devfs_alloc_unique_number>: limited to 128 numbers
===============================================================================
Changes for patch v188
- Updated major masks in fs/devfs/util.c up to Linus' "no new majors"
proclamation. Block: were 126 now 122 free, char: were 26 now 19 free
- Updated README from master HTML file
- Removed remnant of multi-mount support in <devfs_mknod>
- Removed unused DEVFS_FL_SHOW_UNREG flag
===============================================================================
Changes for patch v189
- Removed nlink field from struct devfs_inode
- Removed auto-ownership for /dev/pty/* (BSD ptys) and used
DEVFS_FL_CURRENT_OWNER|DEVFS_FL_NO_PERSISTENCE for /dev/pty/s* (just
like Unix98 pty slaves) and made /dev/pty/m* rw-rw-rw- access
===============================================================================
Changes for patch v190
- Updated README from master HTML file
- Replaced BKL with global rwsem to protect symlink data (quick and
dirty hack)
===============================================================================
Changes for patch v191
- Replaced global rwsem for symlink with per-link refcount
===============================================================================
Changes for patch v192
- Removed unnecessary #ifdef CONFIG_DEVFS_FS from arch/i386/kernel/mtrr.c
- Ported to kernel 2.4.10-pre11
- Set inode->i_mapping->a_ops for block nodes in <get_vfs_inode>
@@ -3,7 +3,7 @@ Devfs (Device File System) FAQ
Linux Devfs (Device File System) FAQ
Richard Gooch
23-AUG-2001
-----------------------------------------------------------------------------
@@ -18,17 +18,14 @@ find out more about it at:
http://www.atnf.csiro.au/~rgooch/linux/
NEWSFLASH: The official 2.3.46 kernel has
included the devfs patch. Future patches will be released which
build on this. These patches are rolled into Linus' tree from time to
time.
A mailing list is available which you may subscribe to. Send email to
majordomo@oss.sgi.com with the following line in the body of the
message:
subscribe devfs
To unsubscribe, send the message body:
unsubscribe devfs
instead. The list is archived at
http://oss.sgi.com/projects/devfs/archive/.
@@ -71,6 +68,8 @@ Alternatives to devfs
Other resources
Translations of this document
-----------------------------------------------------------------------------
@@ -82,7 +81,7 @@ on your root filesystem. Kernel device drivers can register devices by
name rather than major and minor numbers. These devices will appear in
devfs automatically, with whatever default ownership and
protection the driver specified. A daemon (devfsd) can be used to
override these defaults. Devfs has been in the kernel since 2.3.46.
NOTE that devfs is entirely optional. If you prefer the old
disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the
@@ -604,6 +603,20 @@ services. Unfortunately, it's also fragile, complex and undocumented
has problems with symbolic links. Append the following lines to your
/etc/securetty file:
vc/1
vc/2
vc/3
vc/4
vc/5
vc/6
vc/7
vc/8
This will not weaken security. If you have a version of util-linux
earlier than 2.10.h, please upgrade to 2.10.h or later. If you
absolutely cannot upgrade, then also append the following lines to
your /etc/securetty file:
1
2
3
@@ -618,27 +631,13 @@ network (a password is still required, though). However, since there
are problems with dealing with symlinks, I'm suspicious of the level
of security offered in any case.
A better solution is to install util-linux-2.10.h or later, which
fixes a bug with ttyname handling in the login program. Then append
the following lines to your /etc/securetty file:
vc/1
vc/2
vc/3
vc/4
vc/5
vc/6
vc/7
vc/8
This will not weaken security.
XFree86
While not essential, it's probably a good idea to upgrade to XFree86
4.0, as patches went in to make it more devfs-friendly. If you don't,
you'll probably need to apply the following patch to
/etc/security/console.perms so that ordinary users can run
startx. Note that not all distributions have this file (e.g. Debian),
so if it's not present, don't worry about it.
--- /etc/security/console.perms.orig Sat Apr 17 16:26:47 1999
+++ /etc/security/console.perms Fri Feb 25 23:53:55 2000
@@ -691,9 +690,12 @@ Alternatively, use the same technique used for unsupported drivers
described above.
The Kernel
Finally, you need to make sure devfs is compiled into your kernel. Set
CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y using
your favourite configuration tool (e.g. make config or
make xconfig), then run make dep; make clean and recompile your
kernel and modules. At boot, devfs will be mounted onto /dev.
If you encounter problems booting (for example if you forgot a
configuration step), you can pass devfs=nomount at the kernel
@@ -805,8 +807,17 @@ mounted-over /dev. However, you can still use a regular
directory to store the database. The sample /etc/devfsd.conf
file above may still be used. You will need to create the
/dev-state directory prior to installing devfsd. If you have
old permissions in /dev, then just copy (or move) the device
nodes over to the new directory.
Which method is better?
The best method is to have the permissions database stored in the
mounted-over /dev. This is because you will not need to copy
device nodes over to /dev-state, and because it allows you to
switch between devfs and non-devfs kernels, without requiring you to
copy permissions between /dev-state (for devfs) and
/dev (for non-devfs).
Dealing with drivers without devfs support
@@ -1038,6 +1049,13 @@ that class are available. Often, the entries are symbolic links into a
directory tree that reflects the topology of available devices. The
topological tree is useful for finding how your devices are arranged.
Below is a list of the naming schemes for the most common drivers. A
list of reserved device names is
available for reference. Please send email to
rgooch@atnf.csiro.au to obtain an allocation. Please be
patient (the maintainer is busy). An alternative name may be allocated
instead of the requested name, at the discretion of the maintainer.
Disc Devices
All discs, whether SCSI, IDE or whatever, are placed under the
@@ -1486,6 +1504,47 @@ raised the possibility of moving network devices into the device
namespace, but have had no response.
How can I test if I have devfs compiled into my kernel?
All filesystems built-in or currently loaded are listed in
/proc/filesystems. If you see a devfs entry, then
you know that devfs was compiled into your kernel. If you have
correctly configured and rebuilt your kernel, then devfs will be
built-in. If you think you've configured it in, but
/proc/filesystems doesn't show it, you've made a mistake.
Common mistakes include:
Using a 2.2.x kernel without applying the devfs patch (if you
don't know how to patch your kernel, use 2.4.x instead, don't bother
asking me how to patch)
Forgetting to set CONFIG_EXPERIMENTAL=y
Forgetting to set CONFIG_DEVFS_FS=y
Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs
to be automatically mounted at boot)
Editing your .config manually, instead of using make
config or make xconfig
Forgetting to run make dep; make clean after changing the
configuration and before compiling
Forgetting to compile your kernel and modules
Forgetting to install your kernel
Forgetting to install your modules
Please check twice that you've done all these steps before sending in
a bug report.
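The checklist above boils down to one mechanical test: look for the name in /proc/filesystems. Below is a minimal user-space sketch of that check (fs_listed is a made-up helper name, not part of devfs or the kernel); each line of /proc/filesystems is an optional "nodev" keyword, a tab, then the filesystem name:

```c
#include <string.h>

/* Return 1 if `name` appears as a filesystem name in `buf`, where
 * `buf` holds the contents of /proc/filesystems ("[nodev]<TAB><name>"
 * per line).  fs_listed() is a hypothetical helper for illustration. */
static int fs_listed(const char *buf, const char *name)
{
	size_t nlen = strlen(name);

	while (buf && *buf) {
		const char *end = strchr(buf, '\n');
		size_t len = end ? (size_t)(end - buf) : strlen(buf);
		const char *tab = memchr(buf, '\t', len);
		const char *fs = tab ? tab + 1 : buf;	/* name field */
		size_t flen = len - (size_t)(fs - buf);

		if (flen == nlen && memcmp(fs, name, nlen) == 0)
			return 1;	/* exact match: "dev" won't hit "devfs" */
		buf = end ? end + 1 : NULL;
	}
	return 0;
}
```

Read /proc/filesystems into a buffer and call fs_listed(buf, "devfs"); the exact-length compare avoids false positives from name prefixes.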
How can I test if devfs is mounted on /dev?
The device filesystem will always create an entry called
".devfsd", which is used to communicate with the daemon. Even
if the daemon is not running, this entry will exist. Testing for the
existence of this entry is the approved method of determining if devfs
is mounted or not. Note that the type of entry (i.e. regular file,
character device, named pipe, etc.) may change without notice. Only
the existence of the entry should be relied upon.
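That existence test is a single stat(2) from user space. A minimal sketch, deliberately checking only existence and never the entry's type (devfs_mounted is a made-up helper name, not devfsd's actual code):

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if `dev_dir` (normally "/dev") contains a ".devfsd" entry.
 * Only existence is checked; the entry's type may change, as noted
 * above.  devfs_mounted() is a hypothetical helper for illustration. */
static int devfs_mounted(const char *dev_dir)
{
	char path[512];
	struct stat st;

	if (snprintf(path, sizeof(path), "%s/.devfsd", dev_dir)
	    >= (int)sizeof(path))
		return 0;		/* name too long: treat as absent */
	return stat(path, &st) == 0;	/* any entry type counts */
}
```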
@@ -1716,3 +1775,20 @@ U.S.A. in October 2000.
-----------------------------------------------------------------------------
Translations of this document
This document has been translated into other languages.
A Korean translation by viatoris@nownuri.net is available at
http://home.nownuri.net/~viatoris/devfs/devfs.html
@@ -4,7 +4,7 @@
Richard Gooch <rgooch@atnf.csiro.au>
18-AUG-2001
When CONFIG_DEVFS_DEBUG is enabled, you can pass several boot options
@@ -19,6 +19,8 @@ load requests and device registration, you would do:
devfs=dmod,dreg
You may prefix "no" to any option. This will invert the option.
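The "no" prefix rule can be sketched in a couple of lines of C (parse_option is a made-up name; this is an illustration, not the kernel's actual boot-option parser):

```c
#include <string.h>

/* Strip an optional "no" prefix from a devfs boot option and report
 * whether the option is inverted.  parse_option() is a hypothetical
 * sketch of the rule above, not the kernel's real parser. */
static const char *parse_option(const char *opt, int *invert)
{
	*invert = (strncmp(opt, "no", 2) == 0);
	return *invert ? opt + 2 : opt;
}
```

For example, "nomount" parses as an inverted "mount", while "dmod" passes through unchanged.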
Debugging Options
=================
@@ -42,11 +44,11 @@ dchange print device change requests to <devfs_set_flags>
dilookup print inode lookup requests
diread print inode reads
diget print VFS inode allocations
diunlink print inode unlinks
diwrite print inode writes
dichange print inode changes
dimknod print calls to mknod(2)
@@ -58,10 +60,6 @@ Other Options
These control the default behaviour of devfs. The options are:
show show unregistered devices by default
mount mount devfs onto /dev at boot time
nomount do not mount devfs onto /dev at boot time
only disable non-devfs device nodes for devfs-capable drivers
VERSION = 2
PATCHLEVEL = 4
SUBLEVEL = 10
EXTRAVERSION =-pre13
EXTRAVERSION =-pre14
KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
@@ -202,6 +202,7 @@ CLEAN_FILES = \
drivers/scsi/aic7xxx/aicasm/aicasm_scan.c \
drivers/scsi/aic7xxx/aicasm/y.tab.h \
drivers/scsi/aic7xxx/aicasm/aicasm \
drivers/scsi/53c700-mem.c \
net/khttpd/make_times_h \
net/khttpd/times.h \
submenu*
@@ -2253,11 +2253,9 @@ int __init mtrr_init(void)
proc_root_mtrr->proc_fops = &mtrr_fops;
}
#endif
#ifdef CONFIG_DEVFS_FS
devfs_handle = devfs_register (NULL, "cpu/mtrr", DEVFS_FL_DEFAULT, 0, 0,
S_IFREG | S_IRUGO | S_IWUSR,
&mtrr_fops, NULL);
#endif
init_table ();
return 0;
} /* End Function mtrr_init */
@@ -100,7 +100,7 @@ static int rd_hardsec[NUM_RAMDISKS]; /* Size of real blocks in bytes */
static int rd_blocksizes[NUM_RAMDISKS]; /* Size of 1024 byte blocks :) */
static int rd_kbsize[NUM_RAMDISKS]; /* Size in blocks of 1024 bytes */
static devfs_handle_t devfs_handle;
static struct inode *rd_inode[NUM_RAMDISKS]; /* Protected device inodes */
static struct block_device *rd_bdev[NUM_RAMDISKS];/* Protected device data */
/*
* Parameters for the boot-loading of the RAM disk. These are set by
@@ -186,17 +186,67 @@ __setup("ramdisk_blocksize=", ramdisk_blocksize);
#endif
/*
* Copyright (C) 2000 Linus Torvalds.
* 2000 Transmeta Corp.
* aops copied from ramfs.
*/
static int ramdisk_readpage(struct file *file, struct page * page)
{
if (!Page_Uptodate(page)) {
memset(kmap(page), 0, PAGE_CACHE_SIZE);
kunmap(page);
flush_dcache_page(page);
SetPageUptodate(page);
}
UnlockPage(page);
return 0;
}
/*
* Writing: just make sure the page gets marked dirty, so that
* the page stealer won't grab it.
*/
static int ramdisk_writepage(struct page *page)
{
SetPageDirty(page);
UnlockPage(page);
return 0;
}
static int ramdisk_prepare_write(struct file *file, struct page *page, unsigned offset, unsigned to)
{
if (!Page_Uptodate(page)) {
void *addr = page_address(page);
memset(addr, 0, PAGE_CACHE_SIZE);
flush_dcache_page(page);
SetPageUptodate(page);
}
SetPageDirty(page);
return 0;
}
static int ramdisk_commit_write(struct file *file, struct page *page, unsigned offset, unsigned to)
{
return 0;
}
static struct address_space_operations ramdisk_aops = {
readpage: ramdisk_readpage,
writepage: ramdisk_writepage,
prepare_write: ramdisk_prepare_write,
commit_write: ramdisk_commit_write,
};
static int rd_blkdev_pagecache_IO(int rw, struct buffer_head * sbh, int minor)
{
struct address_space * mapping = rd_inode[minor]->i_mapping;
struct address_space * mapping;
unsigned long index;
int offset, size, err = 0;
int offset, size, err;
if (sbh->b_page->mapping == mapping) {
if (rw != READ)
SetPageDirty(sbh->b_page);
goto out;
}
err = -EIO;
err = 0;
mapping = rd_bdev[minor]->bd_inode->i_mapping;
index = sbh->b_rsector >> (PAGE_CACHE_SHIFT - 9);
offset = (sbh->b_rsector << 9) & ~PAGE_CACHE_MASK;
@@ -206,8 +256,7 @@ static int rd_blkdev_pagecache_IO(int rw, struct buffer_head * sbh, int minor)
int count;
struct page ** hash;
struct page * page;
const char * src;
char * dst;
char * src, * dst;
int unlock = 0;
count = PAGE_CACHE_SIZE - offset;
@@ -217,20 +266,24 @@ static int rd_blkdev_pagecache_IO(int rw, struct buffer_head * sbh, int minor)
hash = page_hash(mapping, index);
page = __find_get_page(mapping, index, hash);
if (!page && rw != READ) {
if (!page) {
page = grab_cache_page(mapping, index);
err = -ENOMEM;
if (!page)
goto out;
err = 0;
if (!Page_Uptodate(page)) {
memset(kmap(page), 0, PAGE_CACHE_SIZE);
kunmap(page);
flush_dcache_page(page);
SetPageUptodate(page);
}
unlock = 1;
}
index++;
if (!page) {
offset = 0;
continue;
}
if (rw == READ) {
src = kmap(page);
@@ -303,10 +356,11 @@ static int rd_make_request(request_queue_t * q, int rw, struct buffer_head *sbh)
static int rd_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg)
{
int error = -EINVAL;
unsigned int minor;
if (!inode || !inode->i_rdev)
return -EINVAL;
goto out;
minor = MINOR(inode->i_rdev);
@@ -317,40 +371,29 @@ static int rd_ioctl(struct inode *inode, struct file *file, unsigned int cmd, un
/* special: we want to release the ramdisk memory,
it's not like with the other blockdevices where
this ioctl only flushes away the buffer cache. */
{
struct block_device * bdev = inode->i_bdev;
down(&bdev->bd_sem);
if (bdev->bd_openers > 2) {
up(&bdev->bd_sem);
return -EBUSY;
}
bdev->bd_openers--;
bdev->bd_cache_openers--;
iput(rd_inode[minor]);
rd_inode[minor] = NULL;
rd_blocksizes[minor] = rd_blocksize;
up(&bdev->bd_sem);
error = -EBUSY;
down(&inode->i_bdev->bd_sem);
if (inode->i_bdev->bd_openers <= 2) {
truncate_inode_pages(inode->i_mapping, 0);
error = 0;
}
up(&inode->i_bdev->bd_sem);
break;
case BLKGETSIZE: /* Return device size */
if (!arg) return -EINVAL;
return put_user(rd_kbsize[minor] << 1, (long *) arg);
if (!arg)
break;
error = put_user(rd_kbsize[minor] << 1, (long *) arg);
break;
case BLKGETSIZE64:
return put_user((u64)rd_kbsize[minor] << 10, (u64*)arg);
error = put_user((u64)rd_kbsize[minor]<<10, (u64*)arg);
break;
case BLKROSET:
case BLKROGET:
case BLKSSZGET:
return blk_ioctl(inode->i_rdev, cmd, arg);
default:
return -EINVAL;
error = blk_ioctl(inode->i_rdev, cmd, arg);
};
return 0;
out:
return error;
}
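The two size ioctls in the hunk above differ only in units: the driver keeps rd_kbsize in 1K blocks, BLKGETSIZE reports 512-byte sectors (<< 1), and BLKGETSIZE64 reports bytes (<< 10). A sketch of the arithmetic (the helper names are made up for illustration):

```c
/* Unit conversions behind the BLKGETSIZE/BLKGETSIZE64 paths above;
 * kb is the device size in 1K blocks (rd_kbsize[minor]). */
static long sectors_from_kb(long kb)
{
	return kb << 1;			/* 512-byte sectors */
}

static long long bytes_from_kb(long kb)
{
	return (long long)kb << 10;	/* bytes */
}
```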
@@ -377,6 +420,8 @@ static int initrd_release(struct inode *inode,struct file *file)
if (!--initrd_users) {
free_initrd_mem(initrd_start, initrd_end);
initrd_start = 0;
inode->i_bdev->bd_cache_openers--;
blkdev_put(inode->i_bdev, BDEV_FILE);
}
return 0;
}
@@ -384,6 +429,7 @@ static int initrd_release(struct inode *inode,struct file *file)
static struct file_operations initrd_fops = {
read: initrd_read,
release: initrd_release,
};
#endif
@@ -391,36 +437,28 @@ static struct file_operations initrd_fops = {
static int rd_open(struct inode * inode, struct file * filp)
{
#ifdef CONFIG_BLK_DEV_INITRD
if (DEVICE_NR(inode->i_rdev) == INITRD_MINOR) {
static struct block_device_operations initrd_bd_op = {
open: rd_open,
release: initrd_release,
};
int unit = DEVICE_NR(inode->i_rdev);
#ifdef CONFIG_BLK_DEV_INITRD
if (unit == INITRD_MINOR) {
if (!initrd_start) return -ENODEV;
initrd_users++;
filp->f_op = &initrd_fops;
inode->i_bdev->bd_op = &initrd_bd_op;
return 0;
}
#endif
if (DEVICE_NR(inode->i_rdev) >= NUM_RAMDISKS)
if (unit >= NUM_RAMDISKS)
return -ENXIO;
/*
* Immunize device against invalidate_buffers() and prune_icache().
*/
if (rd_inode[DEVICE_NR(inode->i_rdev)] == NULL) {
if (!inode->i_bdev) return -ENXIO;
if ((rd_inode[DEVICE_NR(inode->i_rdev)] = igrab(inode)) != NULL) {
struct block_device *bdev = inode->i_bdev;
/* bdev->bd_sem is held by caller */
bdev->bd_openers++;
bdev->bd_cache_openers++;
}
if (rd_bdev[unit] == NULL) {
rd_bdev[unit] = bdget(kdev_t_to_nr(inode->i_rdev));
rd_bdev[unit]->bd_openers++;
rd_bdev[unit]->bd_cache_openers++;
rd_bdev[unit]->bd_inode->i_mapping->a_ops = &ramdisk_aops;
}
MOD_INC_USE_COUNT;
@@ -447,18 +485,13 @@ static void __exit rd_cleanup (void)
int i;
for (i = 0 ; i < NUM_RAMDISKS; i++) {
if (rd_inode[i]) {
/* withdraw invalidate_buffers() and prune_icache() immunity */
struct block_device *bdev = rd_inode[i]->i_bdev;
down(&bdev->bd_sem);
bdev->bd_openers--;
struct block_device *bdev = rd_bdev[i];
rd_bdev[i] = NULL;
if (bdev) {
bdev->bd_cache_openers--;
up(&bdev->bd_sem);
/* remove stale pointer to module address space */
rd_inode[i]->i_bdev->bd_op = NULL;
iput(rd_inode[i]);
truncate_inode_pages(bdev->bd_inode->i_mapping, 0);
blkdev_put(bdev, BDEV_FILE);
bdput(bdev);
}
destroy_buffers(MKDEV(MAJOR_NR, i));
}
@@ -777,7 +810,7 @@ static void __init rd_load_image(kdev_t device, int offset, int unit)
if (ROOT_DEVICE_NAME != NULL) strcpy (ROOT_DEVICE_NAME, "rd/0");
done:
blkdev_close(inode, &infile);
infile.f_op->release(inode, &infile);
noclose_input:
blkdev_close(out_inode, &outfile);
iput(inode);
@@ -786,7 +819,7 @@ static void __init rd_load_image(kdev_t device, int offset, int unit)
return;
free_inodes: /* free inodes on error */
iput(out_inode);
blkdev_close(inode, &infile);
infile.f_op->release(inode, &infile);
free_inode:
iput(inode);
}
@@ -334,7 +334,8 @@ static int pty_open(struct tty_struct *tty, struct file * filp)
/* Register a slave for the master */
if (tty->driver.major == PTY_MASTER_MAJOR)
tty_register_devfs(&tty->link->driver,
DEVFS_FL_AUTO_OWNER | DEVFS_FL_WAIT,
DEVFS_FL_CURRENT_OWNER |
DEVFS_FL_NO_PERSISTENCE | DEVFS_FL_WAIT,
tty->link->driver.minor_start +
MINOR(tty->device)-tty->driver.minor_start);
retval = 0;
@@ -2186,7 +2186,11 @@ int __init rp_init(void)
*/
memset(&rocket_driver, 0, sizeof(struct tty_driver));
rocket_driver.magic = TTY_DRIVER_MAGIC;
#ifdef CONFIG_DEVFS_FS
rocket_driver.name = "tts/R%d";
#else
rocket_driver.name = "ttyR";
#endif
rocket_driver.major = TTY_ROCKET_MAJOR;
rocket_driver.minor_start = 0;
rocket_driver.num = MAX_RP_PORTS;
@@ -2228,7 +2232,11 @@ int __init rp_init(void)
* the minor number and the subtype code.
*/
callout_driver = rocket_driver;
#ifdef CONFIG_DEVFS_FS
callout_driver.name = "cua/R%d";
#else
callout_driver.name = "cur";
#endif
callout_driver.major = CUA_ROCKET_MAJOR;
callout_driver.minor_start = 0;
callout_driver.subtype = SERIAL_TYPE_CALLOUT;
@@ -139,8 +139,13 @@ static int stl_nrbrds = sizeof(stl_brdconf) / sizeof(stlconf_t);
static char *stl_drvtitle = "Stallion Multiport Serial Driver";
static char *stl_drvname = "stallion";
static char *stl_drvversion = "5.6.0";
#ifdef CONFIG_DEVFS_FS
static char *stl_serialname = "tts/E%d";
static char *stl_calloutname = "cua/E%d";
#else
static char *stl_serialname = "ttyE";
static char *stl_calloutname = "cue";
#endif
static struct tty_driver stl_serial;
static struct tty_driver stl_callout;
@@ -2021,7 +2021,7 @@ void tty_register_devfs (struct tty_driver *driver, unsigned int flags, unsigned
break;
default:
if (driver->major == PTY_MASTER_MAJOR)
flags |= DEVFS_FL_AUTO_OWNER;
mode |= S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
break;
}
if ( (minor < driver->minor_start) ||
/* WARNING: GENERATED FILE (from 53c700.c), DO NOT MODIFY */
#define MEM_MAPPED
/* -*- mode: c; c-basic-offset: 8 -*- */
/* NCR (or Symbios) 53c700 and 53c700-66 Driver
*
* Copyright (C) 2001 by James.Bottomley@HansenPartnership.com
**-----------------------------------------------------------------------------
**
** This program is free software; you can redistribute it and/or modify
** it under the terms of the GNU General Public License as published by
** the Free Software Foundation; either version 2 of the License, or
** (at your option) any later version.
**
** This program is distributed in the hope that it will be useful,
** but WITHOUT ANY WARRANTY; without even the implied warranty of
** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
** GNU General Public License for more details.
**
** You should have received a copy of the GNU General Public License
** along with this program; if not, write to the Free Software
** Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
**
**-----------------------------------------------------------------------------
*/
/* Notes:
*
* This driver is designed exclusively for these chips (virtually the
* earliest of the scripts engine chips). They need their own drivers
* because they are missing so many of the scripts and snazzy register
* features of their elder brothers (the 710, 720 and 770).
*
* The 700 is the lowliest of the line, it can only do async SCSI.
* The 700-66 can at least do synchronous SCSI up to 10MHz.
*
* The 700 chip has no host bus interface logic of its own. However,
* it is usually mapped to a location with well defined register
* offsets. Therefore, if you can determine the base address and the
* irq your board incorporating this chip uses, you can probably use
* this driver to run it (although you'll probably have to write a
* minimal wrapper for the purpose---see the NCR_D700 driver for
* details about how to do this).
*
*
* TODO List:
*
* 1. Better statistics in the proc fs
*
* 2. Implement message queue (queues SCSI messages like commands) and make
* the abort and device reset functions use them.
* */
/* CHANGELOG
*
* Version 2.3
*
* More endianness/cache coherency changes.
*
* Better bad device handling (handles devices lying about tag
* queueing support and devices which fail to provide sense data on
* contingent allegiance conditions)
*
* Many thanks to Richard Hirst <rhirst@linuxcare.com> for patiently
* debugging this driver on the parisc architecture and suggesting
* many improvements and bug fixes.
*
* Thanks also go to Linuxcare Inc. for providing several PARISC
* machines for me to debug the driver on.
*
* Version 2.2
*
* Made the driver mem or io mapped; added endian invariance; added
* dma cache flushing operations for architectures which need it;
* added support for more varied clocking speeds.
*
* Version 2.1
*
* Initial modularisation from the D700. See NCR_D700.c for the rest of
* the changelog.
* */
#define NCR_700_VERSION "2.3"
#include <linux/config.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/string.h>
#include <linux/ioport.h>
#include <linux/delay.h>
#include <linux/spinlock.h>
#include <linux/sched.h>
#include <linux/proc_fs.h>
#include <linux/init.h>
#include <linux/mca.h>
#include <asm/dma.h>
#include <asm/system.h>
#include <asm/io.h>
#include <asm/pgtable.h>
#include <asm/byteorder.h>
#include <linux/blk.h>
#include <linux/module.h>
#include "scsi.h"
#include "hosts.h"
#include "constants.h"
#include "53c700.h"
#ifdef NCR_700_DEBUG
#define STATIC
#else
#define STATIC static
#endif
MODULE_AUTHOR("James Bottomley");
MODULE_DESCRIPTION("53c700 and 53c700-66 Driver");
MODULE_LICENSE("GPL");
/* This is the script */
#include "53c700_d.h"
STATIC int NCR_700_queuecommand(Scsi_Cmnd *, void (*done)(Scsi_Cmnd *));
STATIC int NCR_700_abort(Scsi_Cmnd * SCpnt);
STATIC int NCR_700_bus_reset(Scsi_Cmnd * SCpnt);
STATIC int NCR_700_dev_reset(Scsi_Cmnd * SCpnt);
STATIC int NCR_700_host_reset(Scsi_Cmnd * SCpnt);
STATIC int NCR_700_proc_directory_info(char *, char **, off_t, int, int, int);
STATIC void NCR_700_chip_setup(struct Scsi_Host *host);
STATIC void NCR_700_chip_reset(struct Scsi_Host *host);
static char *NCR_700_phase[] = {
"",
"after selection",
"before command phase",
"after command phase",
"after status phase",
"after data in phase",
"after data out phase",
"during data phase",
};
static char *NCR_700_condition[] = {
"",
"NOT MSG_OUT",
"UNEXPECTED PHASE",
"NOT MSG_IN",
"UNEXPECTED MSG",
"MSG_IN",
"SDTR_MSG RECEIVED",
"REJECT_MSG RECEIVED",
"DISCONNECT_MSG RECEIVED",
"MSG_OUT",
"DATA_IN",
};
static char *NCR_700_fatal_messages[] = {
"unexpected message after reselection",
"still MSG_OUT after message injection",
"not MSG_IN after selection",
"Illegal message length received",
};
static char *NCR_700_SBCL_bits[] = {
"IO ",
"CD ",
"MSG ",
"ATN ",
"SEL ",
"BSY ",
"ACK ",
"REQ ",
};
static char *NCR_700_SBCL_to_phase[] = {
"DATA_OUT",
"DATA_IN",
"CMD_OUT",
"STATE",
"ILLEGAL PHASE",
"ILLEGAL PHASE",
"MSG OUT",
"MSG IN",
};
static __u8 NCR_700_SDTR_msg[] = {
0x01, /* Extended message */
0x03, /* Extended message Length */
0x01, /* SDTR Extended message */
NCR_700_MIN_PERIOD,
NCR_700_MAX_OFFSET
};
struct Scsi_Host * __init
NCR_700_detect(Scsi_Host_Template *tpnt,
struct NCR_700_Host_Parameters *hostdata)
{
__u32 *script = kmalloc(sizeof(SCRIPT), GFP_KERNEL);
__u32 pScript;
struct Scsi_Host *host;
static int banner = 0;
int j;
/* Fill in the missing routines from the host template */
tpnt->queuecommand = NCR_700_queuecommand;
tpnt->eh_abort_handler = NCR_700_abort;
tpnt->eh_device_reset_handler = NCR_700_dev_reset;
tpnt->eh_bus_reset_handler = NCR_700_bus_reset;
tpnt->eh_host_reset_handler = NCR_700_host_reset;
tpnt->can_queue = NCR_700_COMMAND_SLOTS_PER_HOST;
tpnt->sg_tablesize = NCR_700_SG_SEGMENTS;
tpnt->cmd_per_lun = NCR_700_MAX_TAGS;
tpnt->use_clustering = DISABLE_CLUSTERING;
tpnt->use_new_eh_code = 1;
tpnt->proc_info = NCR_700_proc_directory_info;
if(tpnt->name == NULL)
tpnt->name = "53c700";
if(tpnt->proc_name == NULL)
tpnt->proc_name = "53c700";
if((host = scsi_register(tpnt, 4)) == NULL)
return NULL;
if(script == NULL) {
printk(KERN_ERR "53c700: Failed to allocate script, detaching\n");
scsi_unregister(host);
return NULL;
}
hostdata->slots = kmalloc(sizeof(struct NCR_700_command_slot) * NCR_700_COMMAND_SLOTS_PER_HOST, GFP_KERNEL);
if(hostdata->slots == NULL) {
printk(KERN_ERR "53c700: Failed to allocate command slots, detaching\n");
scsi_unregister(host);
return NULL;
}
memset(hostdata->slots, 0, sizeof(struct NCR_700_command_slot) * NCR_700_COMMAND_SLOTS_PER_HOST);
for(j = 0; j < NCR_700_COMMAND_SLOTS_PER_HOST; j++) {
if(j == 0)
hostdata->free_list = &hostdata->slots[j];
else
hostdata->slots[j-1].ITL_forw = &hostdata->slots[j];
hostdata->slots[j].state = NCR_700_SLOT_FREE;
}
host->hostdata[0] = (__u32)hostdata;
for(j = 0; j < sizeof(SCRIPT)/sizeof(SCRIPT[0]); j++) {
script[j] = bS_to_host(SCRIPT[j]);
}
/* bus physical address of script */
pScript = virt_to_bus(script);
/* adjust all labels to be bus physical */
for(j = 0; j < PATCHES; j++) {
script[LABELPATCHES[j]] = bS_to_host(pScript + SCRIPT[LABELPATCHES[j]]);
}
/* now patch up fixed addresses */
script_patch_32(script, MessageLocation,
virt_to_bus(&hostdata->msgout[0]));
script_patch_32(script, StatusAddress,
virt_to_bus(&hostdata->status));
script_patch_32(script, ReceiveMsgAddress,
virt_to_bus(&hostdata->msgin[0]));
hostdata->script = script;
hostdata->pScript = pScript;
hostdata->state = NCR_700_HOST_FREE;
spin_lock_init(&hostdata->lock);
hostdata->cmd = NULL;
host->max_id = 7;
host->max_lun = NCR_700_MAX_LUNS;
host->unique_id = hostdata->base;
host->base = hostdata->base;
host->hostdata[0] = (unsigned long)hostdata;
/* kick the chip */
NCR_700_writeb(0xff, host, CTEST9_REG);
hostdata->rev = (NCR_700_readb(host, CTEST7_REG)>>4) & 0x0f;
hostdata->fast = (NCR_700_readb(host, CTEST9_REG) == 0);
if(banner == 0) {
printk(KERN_NOTICE "53c700: Version " NCR_700_VERSION " By James.Bottomley@HansenPartnership.com\n");
banner = 1;
}
printk(KERN_NOTICE "scsi%d: %s rev %d %s\n", host->host_no,
hostdata->fast ? "53c700-66" : "53c700",
hostdata->rev, hostdata->differential ?
"(Differential)" : "");
/* reset the chip */
NCR_700_chip_reset(host);
NCR_700_writeb(ASYNC_OPERATION , host, SXFER_REG);
return host;
}
int
NCR_700_release(struct Scsi_Host *host)
{
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)host->hostdata[0];
kfree(hostdata->script);
return 1;
}
static inline __u8
NCR_700_identify(int can_disconnect, __u8 lun)
{
return IDENTIFY_BASE |
((can_disconnect) ? 0x40 : 0) |
(lun & NCR_700_LUN_MASK);
}
/*
* Function : static int datapath_residual (Scsi_Host *host)
*
* Purpose : return residual data count of what's in the chip. If you
* really want to know what this function is doing, it's almost a
* direct transcription of the algorithm described in the 53c710
* guide, except that the DBC and DFIFO registers are only 6 bits
* wide.
*
* Inputs : host - SCSI host */
static inline int
NCR_700_data_residual (struct Scsi_Host *host) {
int count, synchronous;
unsigned int ddir;
count = ((NCR_700_readb(host, DFIFO_REG) & 0x3f) -
(NCR_700_readl(host, DBC_REG) & 0x3f)) & 0x3f;
synchronous = NCR_700_readb(host, SXFER_REG) & 0x0f;
/* get the data direction */
ddir = NCR_700_readb(host, CTEST0_REG) & 0x01;
if (ddir) {
/* Receive */
if (synchronous)
count += (NCR_700_readb(host, SSTAT2_REG) & 0xf0) >> 4;
else
if (NCR_700_readb(host, SSTAT1_REG) & SIDL_REG_FULL)
++count;
} else {
/* Send */
__u8 sstat = NCR_700_readb(host, SSTAT1_REG);
if (sstat & SODL_REG_FULL)
++count;
if (synchronous && (sstat & SODR_REG_FULL))
++count;
}
return count;
}
/* print out the SCSI wires and corresponding phase from the SBCL register
* in the chip */
static inline char *
sbcl_to_string(__u8 sbcl)
{
int i;
static char ret[256];
ret[0]='\0';
for(i=0; i<8; i++) {
if((1<<i) & sbcl)
strcat(ret, NCR_700_SBCL_bits[i]);
}
strcat(ret, NCR_700_SBCL_to_phase[sbcl & 0x07]);
return ret;
}
static inline __u8
bitmap_to_number(__u8 bitmap)
{
__u8 i;
for(i=0; i<8 && !(bitmap &(1<<i)); i++)
;
return i;
}
/* Pull a slot off the free list */
STATIC struct NCR_700_command_slot *
find_empty_slot(struct NCR_700_Host_Parameters *hostdata)
{
struct NCR_700_command_slot *slot = hostdata->free_list;
if(slot == NULL) {
/* sanity check */
if(hostdata->command_slot_count != NCR_700_COMMAND_SLOTS_PER_HOST)
printk(KERN_ERR "SLOTS FULL, but count is %d, should be %d\n", hostdata->command_slot_count, NCR_700_COMMAND_SLOTS_PER_HOST);
return NULL;
}
if(slot->state != NCR_700_SLOT_FREE)
/* should panic! */
printk(KERN_ERR "BUSY SLOT ON FREE LIST!!!\n");
hostdata->free_list = slot->ITL_forw;
slot->ITL_forw = NULL;
/* NOTE: set the state to busy here, not queued, since this
* indicates the slot is in use and cannot be run by the IRQ
* finish routine. If we cannot queue the command when it
	 * is properly built, we then change to NCR_700_SLOT_QUEUED */
slot->state = NCR_700_SLOT_BUSY;
hostdata->command_slot_count++;
return slot;
}
STATIC void
free_slot(struct NCR_700_command_slot *slot,
struct NCR_700_Host_Parameters *hostdata)
{
int hash;
struct NCR_700_command_slot **forw, **back;
if((slot->state & NCR_700_SLOT_MASK) != NCR_700_SLOT_MAGIC) {
printk(KERN_ERR "53c700: SLOT %p is not MAGIC!!!\n", slot);
}
if(slot->state == NCR_700_SLOT_FREE) {
printk(KERN_ERR "53c700: SLOT %p is FREE!!!\n", slot);
}
/* remove from queues */
if(slot->tag != NCR_700_NO_TAG) {
hash = hash_ITLQ(slot->cmnd->target, slot->cmnd->lun,
slot->tag);
if(slot->ITLQ_forw == NULL)
back = &hostdata->ITLQ_Hash_back[hash];
else
back = &slot->ITLQ_forw->ITLQ_back;
if(slot->ITLQ_back == NULL)
forw = &hostdata->ITLQ_Hash_forw[hash];
else
forw = &slot->ITLQ_back->ITLQ_forw;
*forw = slot->ITLQ_forw;
*back = slot->ITLQ_back;
}
hash = hash_ITL(slot->cmnd->target, slot->cmnd->lun);
if(slot->ITL_forw == NULL)
back = &hostdata->ITL_Hash_back[hash];
else
back = &slot->ITL_forw->ITL_back;
if(slot->ITL_back == NULL)
forw = &hostdata->ITL_Hash_forw[hash];
else
forw = &slot->ITL_back->ITL_forw;
*forw = slot->ITL_forw;
*back = slot->ITL_back;
slot->resume_offset = 0;
slot->cmnd = NULL;
slot->state = NCR_700_SLOT_FREE;
slot->ITL_forw = hostdata->free_list;
hostdata->free_list = slot;
hostdata->command_slot_count--;
}
/* This routine really does very little. The command is indexed on
the ITL and (if tagged) the ITLQ lists in _queuecommand */
STATIC void
save_for_reselection(struct NCR_700_Host_Parameters *hostdata,
Scsi_Cmnd *SCp, __u32 dsp)
{
	/* It's just possible that this gets executed twice */
if(SCp != NULL) {
struct NCR_700_command_slot *slot =
(struct NCR_700_command_slot *)SCp->host_scribble;
slot->resume_offset = dsp;
}
hostdata->state = NCR_700_HOST_FREE;
hostdata->cmd = NULL;
}
/* Most likely nexus is the oldest in each case */
STATIC inline struct NCR_700_command_slot *
find_ITL_Nexus(struct NCR_700_Host_Parameters *hostdata, __u8 pun, __u8 lun)
{
int hash = hash_ITL(pun, lun);
struct NCR_700_command_slot *slot = hostdata->ITL_Hash_back[hash];
while(slot != NULL && !(slot->cmnd->target == pun &&
slot->cmnd->lun == lun))
slot = slot->ITL_back;
return slot;
}
STATIC inline struct NCR_700_command_slot *
find_ITLQ_Nexus(struct NCR_700_Host_Parameters *hostdata, __u8 pun,
__u8 lun, __u8 tag)
{
int hash = hash_ITLQ(pun, lun, tag);
struct NCR_700_command_slot *slot = hostdata->ITLQ_Hash_back[hash];
while(slot != NULL && !(slot->cmnd->target == pun
&& slot->cmnd->lun == lun && slot->tag == tag))
slot = slot->ITLQ_back;
#ifdef NCR_700_TAG_DEBUG
if(slot != NULL) {
struct NCR_700_command_slot *n = slot->ITLQ_back;
while(n != NULL && n->cmnd->target != pun
&& n->cmnd->lun != lun && n->tag != tag)
n = n->ITLQ_back;
if(n != NULL && n->cmnd->target == pun && n->cmnd->lun == lun
&& n->tag == tag) {
printk(KERN_WARNING "53c700: WARNING: DUPLICATE tag %d\n",
tag);
}
}
#endif
return slot;
}
/* This translates the SDTR message offset and period to a value
* which can be loaded into the SXFER_REG.
*
* NOTE: According to SCSI-2, the true transfer period (in ns) is
* actually four times this period value */
STATIC inline __u8
NCR_700_offset_period_to_sxfer(struct NCR_700_Host_Parameters *hostdata,
__u8 offset, __u8 period)
{
int XFERP;
if(period*4 < NCR_700_MIN_PERIOD) {
printk(KERN_WARNING "53c700: Period %dns is less than SCSI-2 minimum, setting to %d\n", period*4, NCR_700_MIN_PERIOD);
period = NCR_700_MIN_PERIOD/4;
}
XFERP = (period*4 * hostdata->sync_clock)/1000 - 4;
if(offset > NCR_700_MAX_OFFSET) {
printk(KERN_WARNING "53c700: Offset %d exceeds maximum, setting to %d\n",
offset, NCR_700_MAX_OFFSET);
offset = NCR_700_MAX_OFFSET;
}
if(XFERP < NCR_700_MIN_XFERP) {
		printk(KERN_WARNING "53c700: XFERP %d is less than minimum, setting to %d\n",
XFERP, NCR_700_MIN_XFERP);
XFERP = NCR_700_MIN_XFERP;
}
return (offset & 0x0f) | (XFERP & 0x07)<<4;
}
STATIC inline void
NCR_700_scsi_done(struct NCR_700_Host_Parameters *hostdata,
Scsi_Cmnd *SCp, int result)
{
hostdata->state = NCR_700_HOST_FREE;
hostdata->cmd = NULL;
if(SCp != NULL) {
struct NCR_700_command_slot *slot =
(struct NCR_700_command_slot *)SCp->host_scribble;
if(SCp->cmnd[0] == REQUEST_SENSE && SCp->cmnd[6] == NCR_700_INTERNAL_SENSE_MAGIC) {
#ifdef NCR_700_DEBUG
printk(" ORIGINAL CMD %p RETURNED %d, new return is %d sense is",
SCp, SCp->cmnd[7], result);
print_sense("53c700", SCp);
#endif
if(result == 0)
result = SCp->cmnd[7];
}
free_slot(slot, hostdata);
SCp->host_scribble = NULL;
SCp->result = result;
SCp->scsi_done(SCp);
if(NCR_700_get_depth(SCp->device) == 0 ||
NCR_700_get_depth(SCp->device) > NCR_700_MAX_TAGS)
printk(KERN_ERR "Invalid depth in NCR_700_scsi_done(): %d\n",
NCR_700_get_depth(SCp->device));
NCR_700_set_depth(SCp->device, NCR_700_get_depth(SCp->device) - 1);
} else {
printk(KERN_ERR "53c700: SCSI DONE HAS NULL SCp\n");
}
}
STATIC void
NCR_700_internal_bus_reset(struct Scsi_Host *host)
{
/* Bus reset */
NCR_700_writeb(ASSERT_RST, host, SCNTL1_REG);
udelay(50);
NCR_700_writeb(0, host, SCNTL1_REG);
}
STATIC void
NCR_700_chip_setup(struct Scsi_Host *host)
{
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)host->hostdata[0];
NCR_700_writeb(1 << host->this_id, host, SCID_REG);
NCR_700_writeb(0, host, SBCL_REG);
NCR_700_writeb(0, host, SXFER_REG);
NCR_700_writeb(PHASE_MM_INT | SEL_TIMEOUT_INT | GROSS_ERR_INT | UX_DISC_INT
| RST_INT | PAR_ERR_INT | SELECT_INT, host, SIEN_REG);
NCR_700_writeb(ABORT_INT | INT_INST_INT | ILGL_INST_INT, host, DIEN_REG);
NCR_700_writeb(BURST_LENGTH_8, host, DMODE_REG);
NCR_700_writeb(FULL_ARBITRATION | PARITY | AUTO_ATN, host, SCNTL0_REG);
NCR_700_writeb(LAST_DIS_ENBL | ENABLE_ACTIVE_NEGATION|GENERATE_RECEIVE_PARITY,
host, CTEST8_REG);
NCR_700_writeb(ENABLE_SELECT, host, SCNTL1_REG);
if(hostdata->clock > 75) {
		printk(KERN_ERR "53c700: Clock speed %dMHz is too high: 75MHz is the maximum this chip can be driven at\n", hostdata->clock);
/* do the best we can, but the async clock will be out
* of spec: sync divider 2, async divider 3 */
DEBUG(("53c700: sync 2 async 3\n"));
NCR_700_writeb(SYNC_DIV_2_0, host, SBCL_REG);
NCR_700_writeb(ASYNC_DIV_3_0, host, DCNTL_REG);
hostdata->sync_clock = hostdata->clock/2;
} else if(hostdata->clock > 50 && hostdata->clock <= 75) {
/* sync divider 1.5, async divider 3 */
DEBUG(("53c700: sync 1.5 async 3\n"));
NCR_700_writeb(SYNC_DIV_1_5, host, SBCL_REG);
NCR_700_writeb(ASYNC_DIV_3_0, host, DCNTL_REG);
hostdata->sync_clock = hostdata->clock*2;
hostdata->sync_clock /= 3;
} else if(hostdata->clock > 37 && hostdata->clock <= 50) {
/* sync divider 1, async divider 2 */
DEBUG(("53c700: sync 1 async 2\n"));
NCR_700_writeb(SYNC_DIV_1_0, host, SBCL_REG);
NCR_700_writeb(ASYNC_DIV_2_0, host, DCNTL_REG);
hostdata->sync_clock = hostdata->clock;
} else if(hostdata->clock > 25 && hostdata->clock <=37) {
/* sync divider 1, async divider 1.5 */
DEBUG(("53c700: sync 1 async 1.5\n"));
NCR_700_writeb(SYNC_DIV_1_0, host, SBCL_REG);
NCR_700_writeb(ASYNC_DIV_1_5, host, DCNTL_REG);
hostdata->sync_clock = hostdata->clock;
} else {
DEBUG(("53c700: sync 1 async 1\n"));
NCR_700_writeb(SYNC_DIV_1_0, host, SBCL_REG);
NCR_700_writeb(ASYNC_DIV_1_0, host, DCNTL_REG);
/* sync divider 1, async divider 1 */
}
}
STATIC void
NCR_700_chip_reset(struct Scsi_Host *host)
{
/* Chip reset */
NCR_700_writeb(SOFTWARE_RESET, host, DCNTL_REG);
udelay(100);
NCR_700_writeb(0, host, DCNTL_REG);
mdelay(1000);
NCR_700_chip_setup(host);
}
/* The heart of the message processing engine is that the instruction
* immediately after the INT is the normal case (and so must be CLEAR
* ACK). If we want to do something else, we call that routine in
* scripts and set temp to be the normal case + 8 (skipping the CLEAR
 * ACK) so that the routine returns correctly to resume its activity */
STATIC __u32
process_extended_message(struct Scsi_Host *host,
struct NCR_700_Host_Parameters *hostdata,
Scsi_Cmnd *SCp, __u32 dsp, __u32 dsps)
{
__u32 resume_offset = dsp, temp = dsp + 8;
__u8 pun = 0xff, lun = 0xff;
if(SCp != NULL) {
pun = SCp->target;
lun = SCp->lun;
}
switch(hostdata->msgin[2]) {
case A_SDTR_MSG:
if(SCp != NULL && NCR_700_is_flag_set(SCp->device, NCR_700_DEV_BEGIN_SYNC_NEGOTIATION)) {
__u8 period = hostdata->msgin[3];
__u8 offset = hostdata->msgin[4];
__u8 sxfer;
if(offset != 0 && period != 0)
sxfer = NCR_700_offset_period_to_sxfer(hostdata, offset, period);
else
sxfer = 0;
if(sxfer != NCR_700_get_SXFER(SCp->device)) {
printk(KERN_INFO "scsi%d: (%d:%d) Synchronous at offset %d, period %dns\n",
host->host_no, pun, lun,
offset, period*4);
NCR_700_set_SXFER(SCp->device, sxfer);
}
NCR_700_set_flag(SCp->device, NCR_700_DEV_NEGOTIATED_SYNC);
NCR_700_clear_flag(SCp->device, NCR_700_DEV_BEGIN_SYNC_NEGOTIATION);
NCR_700_writeb(NCR_700_get_SXFER(SCp->device),
host, SXFER_REG);
} else {
/* SDTR message out of the blue, reject it */
printk(KERN_WARNING "scsi%d Unexpected SDTR msg\n",
host->host_no);
hostdata->msgout[0] = A_REJECT_MSG;
dma_cache_wback((unsigned long)hostdata->msgout, sizeof(hostdata->msgout));
script_patch_16(hostdata->script, MessageCount, 1);
/* SendMsgOut returns, so set up the return
* address */
resume_offset = hostdata->pScript + Ent_SendMessageWithATN;
}
break;
case A_WDTR_MSG:
printk(KERN_INFO "scsi%d: (%d:%d), Unsolicited WDTR after CMD, Rejecting\n",
host->host_no, pun, lun);
hostdata->msgout[0] = A_REJECT_MSG;
dma_cache_wback((unsigned long)hostdata->msgout, sizeof(hostdata->msgout));
script_patch_16(hostdata->script, MessageCount, 1);
resume_offset = hostdata->pScript + Ent_SendMessageWithATN;
break;
default:
printk(KERN_INFO "scsi%d (%d:%d): Unexpected message %s: ",
host->host_no, pun, lun,
NCR_700_phase[(dsps & 0xf00) >> 8]);
print_msg(hostdata->msgin);
printk("\n");
/* just reject it */
hostdata->msgout[0] = A_REJECT_MSG;
dma_cache_wback((unsigned long)hostdata->msgout, sizeof(hostdata->msgout));
script_patch_16(hostdata->script, MessageCount, 1);
/* SendMsgOut returns, so set up the return
* address */
resume_offset = hostdata->pScript + Ent_SendMessageWithATN;
}
NCR_700_writel(temp, host, TEMP_REG);
return resume_offset;
}
STATIC __u32
process_message(struct Scsi_Host *host, struct NCR_700_Host_Parameters *hostdata,
Scsi_Cmnd *SCp, __u32 dsp, __u32 dsps)
{
/* work out where to return to */
__u32 temp = dsp + 8, resume_offset = dsp;
__u8 pun = 0xff, lun = 0xff;
dma_cache_inv((unsigned long)hostdata->msgin, sizeof(hostdata->msgin));
if(SCp != NULL) {
pun = SCp->target;
lun = SCp->lun;
}
#ifdef NCR_700_DEBUG
printk("scsi%d (%d:%d): message %s: ", host->host_no, pun, lun,
NCR_700_phase[(dsps & 0xf00) >> 8]);
print_msg(hostdata->msgin);
printk("\n");
#endif
switch(hostdata->msgin[0]) {
case A_EXTENDED_MSG:
return process_extended_message(host, hostdata, SCp,
dsp, dsps);
case A_REJECT_MSG:
if(SCp != NULL && NCR_700_is_flag_set(SCp->device, NCR_700_DEV_BEGIN_SYNC_NEGOTIATION)) {
/* Rejected our sync negotiation attempt */
NCR_700_set_SXFER(SCp->device, 0);
NCR_700_set_flag(SCp->device, NCR_700_DEV_NEGOTIATED_SYNC);
NCR_700_clear_flag(SCp->device, NCR_700_DEV_BEGIN_SYNC_NEGOTIATION);
} else if(SCp != NULL && NCR_700_is_flag_set(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING)) {
/* rejected our first simple tag message */
printk(KERN_WARNING "scsi%d (%d:%d) Rejected first tag queue attempt, turning off tag queueing\n", host->host_no, pun, lun);
NCR_700_clear_flag(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING);
hostdata->tag_negotiated &= ~(1<<SCp->target);
} else {
printk(KERN_WARNING "scsi%d (%d:%d) Unexpected REJECT Message %s\n",
host->host_no, pun, lun,
NCR_700_phase[(dsps & 0xf00) >> 8]);
/* however, just ignore it */
}
break;
case A_PARITY_ERROR_MSG:
printk(KERN_ERR "scsi%d (%d:%d) Parity Error!\n", host->host_no,
pun, lun);
NCR_700_internal_bus_reset(host);
break;
case A_SIMPLE_TAG_MSG:
printk(KERN_INFO "scsi%d (%d:%d) SIMPLE TAG %d %s\n", host->host_no,
pun, lun, hostdata->msgin[1],
NCR_700_phase[(dsps & 0xf00) >> 8]);
/* just ignore it */
break;
default:
printk(KERN_INFO "scsi%d (%d:%d): Unexpected message %s: ",
host->host_no, pun, lun,
NCR_700_phase[(dsps & 0xf00) >> 8]);
print_msg(hostdata->msgin);
printk("\n");
/* just reject it */
hostdata->msgout[0] = A_REJECT_MSG;
dma_cache_wback((unsigned long)hostdata->msgout, sizeof(hostdata->msgout));
script_patch_16(hostdata->script, MessageCount, 1);
/* SendMsgOut returns, so set up the return
* address */
resume_offset = hostdata->pScript + Ent_SendMessageWithATN;
break;
}
NCR_700_writel(temp, host, TEMP_REG);
return resume_offset;
}
STATIC __u32
process_script_interrupt(__u32 dsps, __u32 dsp, Scsi_Cmnd *SCp,
struct Scsi_Host *host,
struct NCR_700_Host_Parameters *hostdata)
{
__u32 resume_offset = 0;
__u8 pun = 0xff, lun=0xff;
if(SCp != NULL) {
pun = SCp->target;
lun = SCp->lun;
}
if(dsps == A_GOOD_STATUS_AFTER_STATUS) {
dma_cache_inv((unsigned long)hostdata->status, sizeof(hostdata->status));
DEBUG((" COMMAND COMPLETE, status=%02x\n",
hostdata->status));
/* OK, if TCQ still on, we know it works */
NCR_700_clear_flag(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING);
		/* check for contingent allegiance conditions */
if(status_byte(hostdata->status) == CHECK_CONDITION ||
status_byte(hostdata->status) == COMMAND_TERMINATED) {
struct NCR_700_command_slot *slot =
(struct NCR_700_command_slot *)SCp->host_scribble;
if(SCp->cmnd[0] == REQUEST_SENSE) {
/* OOPS: bad device, returning another
* contingent allegiance condition */
printk(KERN_ERR "scsi%d (%d:%d) broken device is looping in contingent allegiance: ignoring\n", host->host_no, pun, lun);
NCR_700_scsi_done(hostdata, SCp, hostdata->status);
} else {
DEBUG((" cmd %p has status %d, requesting sense\n",
SCp, hostdata->status));
/* we can destroy the command here because the
* contingent allegiance condition will cause a
* retry which will re-copy the command from the
* saved data_cmnd */
SCp->cmnd[0] = REQUEST_SENSE;
SCp->cmnd[1] = (SCp->lun & 0x7) << 5;
SCp->cmnd[2] = 0;
SCp->cmnd[3] = 0;
SCp->cmnd[4] = sizeof(SCp->sense_buffer);
SCp->cmnd[5] = 0;
SCp->cmd_len = 6;
/* Here's a quiet hack: the REQUEST_SENSE command is
* six bytes, so store a flag indicating that this
* was an internal sense request and the original
* status at the end of the command */
SCp->cmnd[6] = NCR_700_INTERNAL_SENSE_MAGIC;
SCp->cmnd[7] = hostdata->status;
slot->SG[0].ins = bS_to_host(SCRIPT_MOVE_DATA_IN | sizeof(SCp->sense_buffer));
slot->SG[0].pAddr = bS_to_host(virt_to_bus(SCp->sense_buffer));
slot->SG[1].ins = bS_to_host(SCRIPT_RETURN);
slot->SG[1].pAddr = 0;
slot->resume_offset = hostdata->pScript;
dma_cache_wback((unsigned long)slot->SG, sizeof(slot->SG[0])*2);
dma_cache_inv((unsigned long)SCp->sense_buffer, sizeof(SCp->sense_buffer));
/* queue the command for reissue */
slot->state = NCR_700_SLOT_QUEUED;
hostdata->state = NCR_700_HOST_FREE;
hostdata->cmd = NULL;
}
} else {
if(status_byte(hostdata->status) == GOOD &&
SCp->cmnd[0] == INQUIRY && SCp->use_sg == 0) {
/* Piggy back the tag queueing support
* on this command */
if(((char *)SCp->request_buffer)[7] & 0x02) {
printk(KERN_INFO "scsi%d: (%d:%d) Enabling Tag Command Queuing\n", host->host_no, pun, lun);
hostdata->tag_negotiated |= (1<<SCp->target);
NCR_700_set_flag(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING);
} else {
NCR_700_clear_flag(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING);
hostdata->tag_negotiated &= ~(1<<SCp->target);
}
}
NCR_700_scsi_done(hostdata, SCp, hostdata->status);
}
} else if((dsps & 0xfffff0f0) == A_UNEXPECTED_PHASE) {
__u8 i = (dsps & 0xf00) >> 8;
printk(KERN_ERR "scsi%d: (%d:%d), UNEXPECTED PHASE %s (%s)\n",
host->host_no, pun, lun,
NCR_700_phase[i],
sbcl_to_string(NCR_700_readb(host, SBCL_REG)));
printk(KERN_ERR " len = %d, cmd =", SCp->cmd_len);
print_command(SCp->cmnd);
NCR_700_internal_bus_reset(host);
} else if((dsps & 0xfffff000) == A_FATAL) {
int i = (dsps & 0xfff);
printk(KERN_ERR "scsi%d: (%d:%d) FATAL ERROR: %s\n",
host->host_no, pun, lun, NCR_700_fatal_messages[i]);
if(dsps == A_FATAL_ILLEGAL_MSG_LENGTH) {
printk(KERN_ERR " msg begins %02x %02x\n",
hostdata->msgin[0], hostdata->msgin[1]);
}
NCR_700_internal_bus_reset(host);
} else if((dsps & 0xfffff0f0) == A_DISCONNECT) {
#ifdef NCR_700_DEBUG
__u8 i = (dsps & 0xf00) >> 8;
printk("scsi%d: (%d:%d), DISCONNECTED (%d) %s\n",
host->host_no, pun, lun,
i, NCR_700_phase[i]);
#endif
save_for_reselection(hostdata, SCp, dsp);
} else if(dsps == A_RESELECTION_IDENTIFIED) {
__u8 lun;
struct NCR_700_command_slot *slot;
__u8 reselection_id = hostdata->reselection_id;
dma_cache_inv((unsigned long)hostdata->msgin, sizeof(hostdata->msgin));
lun = hostdata->msgin[0] & 0x1f;
hostdata->reselection_id = 0xff;
DEBUG(("scsi%d: (%d:%d) RESELECTED!\n",
host->host_no, reselection_id, lun));
/* clear the reselection indicator */
if(hostdata->msgin[1] == A_SIMPLE_TAG_MSG) {
slot = find_ITLQ_Nexus(hostdata, reselection_id,
lun, hostdata->msgin[2]);
} else {
slot = find_ITL_Nexus(hostdata, reselection_id, lun);
}
retry:
if(slot == NULL) {
struct NCR_700_command_slot *s = find_ITL_Nexus(hostdata, reselection_id, lun);
printk(KERN_ERR "scsi%d: (%d:%d) RESELECTED but no saved command (MSG = %02x %02x %02x)!!\n",
host->host_no, reselection_id, lun,
hostdata->msgin[0], hostdata->msgin[1],
hostdata->msgin[2]);
printk(KERN_ERR " OUTSTANDING TAGS:");
while(s != NULL) {
if(s->cmnd->target == reselection_id &&
s->cmnd->lun == lun) {
printk("%d ", s->tag);
if(s->tag == hostdata->msgin[2]) {
printk(" ***FOUND*** \n");
slot = s;
goto retry;
}
}
s = s->ITL_back;
}
printk("\n");
} else {
if(hostdata->state != NCR_700_HOST_BUSY)
printk(KERN_ERR "scsi%d: FATAL, host not busy during valid reselection!\n",
host->host_no);
resume_offset = slot->resume_offset;
hostdata->cmd = slot->cmnd;
/* re-patch for this command */
script_patch_32_abs(hostdata->script, CommandAddress,
virt_to_bus(slot->cmnd->cmnd));
script_patch_16(hostdata->script,
CommandCount, slot->cmnd->cmd_len);
script_patch_32_abs(hostdata->script, SGScriptStartAddress,
virt_to_bus(&slot->SG[0].ins));
/* Note: setting SXFER only works if we're
* still in the MESSAGE phase, so it is vital
* that ACK is still asserted when we process
* the reselection message. The resume offset
* should therefore always clear ACK */
NCR_700_writeb(NCR_700_get_SXFER(hostdata->cmd->device),
host, SXFER_REG);
}
} else if(dsps == A_RESELECTED_DURING_SELECTION) {
/* This section is full of debugging code because I've
* never managed to reach it. I think what happens is
		 * that, because the 700 runs with selection interrupts
		 * enabled the whole time, we take a selection interrupt
		 * before we manage to get to the reselected script
		 * interrupt */
__u8 reselection_id = NCR_700_readb(host, SFBR_REG);
struct NCR_700_command_slot *slot;
/* Take out our own ID */
reselection_id &= ~(1<<host->this_id);
printk(KERN_INFO "scsi%d: (%d:%d) RESELECTION DURING SELECTION, dsp=%p[%04x] state=%d, count=%d\n",
host->host_no, reselection_id, lun, (void *)dsp, dsp - hostdata->pScript, hostdata->state, hostdata->command_slot_count);
{
/* FIXME: DEBUGGING CODE */
__u32 SG = (__u32)bus_to_virt(hostdata->script[A_SGScriptStartAddress_used[0]]);
int i;
for(i=0; i< NCR_700_COMMAND_SLOTS_PER_HOST; i++) {
if(SG >= (__u32)(&hostdata->slots[i].SG[0])
&& SG <= (__u32)(&hostdata->slots[i].SG[NCR_700_SG_SEGMENTS]))
break;
}
printk(KERN_INFO "IDENTIFIED SG segment as being %p in slot %p, cmd %p, slot->resume_offset=%p\n", (void *)SG, &hostdata->slots[i], hostdata->slots[i].cmnd, (void *)hostdata->slots[i].resume_offset);
SCp = hostdata->slots[i].cmnd;
}
if(SCp != NULL) {
slot = (struct NCR_700_command_slot *)SCp->host_scribble;
/* change slot from busy to queued to redo command */
slot->state = NCR_700_SLOT_QUEUED;
}
hostdata->cmd = NULL;
if(reselection_id == 0) {
if(hostdata->reselection_id == 0xff) {
printk(KERN_ERR "scsi%d: Invalid reselection during selection!!\n", host->host_no);
return 0;
} else {
printk(KERN_ERR "scsi%d: script reselected and we took a selection interrupt\n",
host->host_no);
reselection_id = hostdata->reselection_id;
}
} else {
/* convert to real ID */
reselection_id = bitmap_to_number(reselection_id);
}
hostdata->reselection_id = reselection_id;
hostdata->msgin[1] = 0;
dma_cache_wback((unsigned long)hostdata->msgin, sizeof(hostdata->msgin));
if(hostdata->tag_negotiated & (1<<reselection_id)) {
resume_offset = hostdata->pScript + Ent_GetReselectionWithTag;
} else {
resume_offset = hostdata->pScript + Ent_GetReselectionData;
}
} else if(dsps == A_COMPLETED_SELECTION_AS_TARGET) {
/* we've just disconnected from the bus, do nothing since
* a return here will re-run the queued command slot
* that may have been interrupted by the initial selection */
DEBUG((" SELECTION COMPLETED\n"));
} else if((dsps & 0xfffff0f0) == A_MSG_IN) {
resume_offset = process_message(host, hostdata, SCp,
dsp, dsps);
} else if((dsps & 0xfffff000) == 0) {
__u8 i = (dsps & 0xf0) >> 4, j = (dsps & 0xf00) >> 8;
printk(KERN_ERR "scsi%d: (%d:%d), unhandled script condition %s %s at %04x\n",
host->host_no, pun, lun, NCR_700_condition[i],
NCR_700_phase[j], dsp - hostdata->pScript);
if(SCp != NULL) {
print_command(SCp->cmnd);
if(SCp->use_sg) {
for(i = 0; i < SCp->use_sg + 1; i++) {
printk(KERN_INFO " SG[%d].length = %d, move_insn=%08x, addr %08x\n", i, ((struct scatterlist *)SCp->buffer)[i].length, ((struct NCR_700_command_slot *)SCp->host_scribble)->SG[i].ins, ((struct NCR_700_command_slot *)SCp->host_scribble)->SG[i].pAddr);
}
}
}
NCR_700_internal_bus_reset(host);
} else if((dsps & 0xfffff000) == A_DEBUG_INTERRUPT) {
printk(KERN_NOTICE "scsi%d (%d:%d) DEBUG INTERRUPT %d AT %p[%04x], continuing\n",
host->host_no, pun, lun, dsps & 0xfff, (void *)dsp, dsp - hostdata->pScript);
resume_offset = dsp;
} else {
printk(KERN_ERR "scsi%d: (%d:%d), unidentified script interrupt 0x%x at %04x\n",
host->host_no, pun, lun, dsps, dsp - hostdata->pScript);
NCR_700_internal_bus_reset(host);
}
return resume_offset;
}
/* We run the 53c700 with selection interrupts always enabled. This
* means that the chip may be selected as soon as the bus frees. On a
* busy bus, this can be before the scripts engine finishes its
* processing. Therefore, part of the selection processing has to be
* to find out what the scripts engine is doing and complete the
* function if necessary (i.e. process the pending disconnect or save
 * the interrupted initial selection) */
STATIC inline __u32
process_selection(struct Scsi_Host *host, __u32 dsp)
{
__u8 id = 0; /* Squash compiler warning */
int count = 0;
__u32 resume_offset = 0;
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)host->hostdata[0];
Scsi_Cmnd *SCp = hostdata->cmd;
__u8 sbcl;
for(count = 0; count < 5; count++) {
id = NCR_700_readb(host, SFBR_REG);
/* Take out our own ID */
id &= ~(1<<host->this_id);
if(id != 0)
break;
udelay(5);
}
sbcl = NCR_700_readb(host, SBCL_REG);
if((sbcl & SBCL_IO) == 0) {
/* mark as having been selected rather than reselected */
id = 0xff;
} else {
/* convert to real ID */
hostdata->reselection_id = id = bitmap_to_number(id);
DEBUG(("scsi%d: Reselected by %d\n",
host->host_no, id));
}
if(hostdata->state == NCR_700_HOST_BUSY && SCp != NULL) {
struct NCR_700_command_slot *slot =
(struct NCR_700_command_slot *)SCp->host_scribble;
DEBUG((" ID %d WARNING: RESELECTION OF BUSY HOST, saving cmd %p, slot %p, addr %x [%04x], resume %x!\n", id, hostdata->cmd, slot, dsp, dsp - hostdata->pScript, resume_offset));
switch(dsp - hostdata->pScript) {
case Ent_Disconnect1:
case Ent_Disconnect2:
save_for_reselection(hostdata, SCp, Ent_Disconnect2 + hostdata->pScript);
break;
case Ent_Disconnect3:
case Ent_Disconnect4:
save_for_reselection(hostdata, SCp, Ent_Disconnect4 + hostdata->pScript);
break;
case Ent_Disconnect5:
case Ent_Disconnect6:
save_for_reselection(hostdata, SCp, Ent_Disconnect6 + hostdata->pScript);
break;
case Ent_Disconnect7:
case Ent_Disconnect8:
save_for_reselection(hostdata, SCp, Ent_Disconnect8 + hostdata->pScript);
break;
case Ent_Finish1:
case Ent_Finish2:
process_script_interrupt(A_GOOD_STATUS_AFTER_STATUS, dsp, SCp, host, hostdata);
break;
default:
slot->state = NCR_700_SLOT_QUEUED;
break;
}
}
hostdata->state = NCR_700_HOST_BUSY;
hostdata->cmd = NULL;
hostdata->msgin[1] = 0;
dma_cache_wback((unsigned long)hostdata->msgin, sizeof(hostdata->msgin));
if(id == 0xff) {
/* Selected as target, Ignore */
resume_offset = hostdata->pScript + Ent_SelectedAsTarget;
} else if(hostdata->tag_negotiated & (1<<id)) {
resume_offset = hostdata->pScript + Ent_GetReselectionWithTag;
} else {
resume_offset = hostdata->pScript + Ent_GetReselectionData;
}
return resume_offset;
}
STATIC int
NCR_700_start_command(Scsi_Cmnd *SCp)
{
struct NCR_700_command_slot *slot =
(struct NCR_700_command_slot *)SCp->host_scribble;
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)SCp->host->hostdata[0];
unsigned long flags;
__u16 count = 1; /* for IDENTIFY message */
save_flags(flags);
cli();
if(hostdata->state != NCR_700_HOST_FREE) {
/* keep this inside the lock to close the race window where
* the running command finishes on another CPU while we don't
* change the state to queued on this one */
slot->state = NCR_700_SLOT_QUEUED;
restore_flags(flags);
DEBUG(("scsi%d: host busy, queueing command %p, slot %p\n",
SCp->host->host_no, slot->cmnd, slot));
return 0;
}
hostdata->state = NCR_700_HOST_BUSY;
hostdata->cmd = SCp;
slot->state = NCR_700_SLOT_BUSY;
/* keep interrupts disabled until we have the command correctly
* set up so we cannot take a selection interrupt */
hostdata->msgout[0] = NCR_700_identify(SCp->cmnd[0] != REQUEST_SENSE,
SCp->lun);
/* for INQUIRY or REQUEST_SENSE commands, we cannot be sure
* if the negotiated transfer parameters still hold, so
* always renegotiate them */
if(SCp->cmnd[0] == INQUIRY || SCp->cmnd[0] == REQUEST_SENSE) {
NCR_700_clear_flag(SCp->device, NCR_700_DEV_NEGOTIATED_SYNC);
}
/* REQUEST_SENSE is asking for contingent I_T_L status. If a
* contingent allegiance condition exists, the device will
* refuse all tags, so send the request sense as untagged */
if((hostdata->tag_negotiated & (1<<SCp->target))
&& (slot->tag != NCR_700_NO_TAG && SCp->cmnd[0] != REQUEST_SENSE)) {
hostdata->msgout[count++] = A_SIMPLE_TAG_MSG;
hostdata->msgout[count++] = slot->tag;
}
if(hostdata->fast &&
NCR_700_is_flag_clear(SCp->device, NCR_700_DEV_NEGOTIATED_SYNC)) {
memcpy(&hostdata->msgout[count], NCR_700_SDTR_msg,
sizeof(NCR_700_SDTR_msg));
count += sizeof(NCR_700_SDTR_msg);
NCR_700_set_flag(SCp->device, NCR_700_DEV_BEGIN_SYNC_NEGOTIATION);
}
dma_cache_wback((unsigned long)hostdata->msgout, count);
script_patch_16(hostdata->script, MessageCount, count);
script_patch_ID(hostdata->script,
Device_ID, 1<<SCp->target);
script_patch_32_abs(hostdata->script, CommandAddress,
virt_to_bus(SCp->cmnd));
script_patch_16(hostdata->script, CommandCount, SCp->cmd_len);
	/* finally plumb the beginning of the SG list into the script */
script_patch_32_abs(hostdata->script, SGScriptStartAddress,
virt_to_bus(&slot->SG[0].ins));
NCR_700_writeb(CLR_FIFO, SCp->host, DFIFO_REG);
/* set the synchronous period/offset */
if(slot->resume_offset == 0)
slot->resume_offset = hostdata->pScript;
NCR_700_writeb(NCR_700_get_SXFER(SCp->device),
SCp->host, SXFER_REG);
/* allow interrupts here so that if we're selected we can take
* a selection interrupt. The script start may not be
* effective in this case, but the selection interrupt will
* save our command in that case */
NCR_700_writel(slot->temp, SCp->host, TEMP_REG);
NCR_700_writel(slot->resume_offset, SCp->host, DSP_REG);
restore_flags(flags);
return 1;
}
void
NCR_700_intr(int irq, void *dev_id, struct pt_regs *regs)
{
struct Scsi_Host *host = (struct Scsi_Host *)dev_id;
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)host->hostdata[0];
__u8 istat;
__u32 resume_offset = 0;
__u8 pun = 0xff, lun = 0xff;
unsigned long flags;
/* Unfortunately, we have to take the io_request_lock here
* rather than the host lock hostdata->lock because we're
* looking to exclude queuecommand from messing with the
* registers while we're processing the interrupt. Since
* queuecommand is called holding io_request_lock, and we have
* to take io_request_lock before we call the command
* scsi_done, we would get a deadlock if we took
	 * hostdata->lock here and in queuecommand: the lock order in
	 * queuecommand is 1) io_request_lock then 2) hostdata->lock,
	 * which is the reverse of the order this routine would take
	 * them in */
spin_lock_irqsave(&io_request_lock, flags);
if((istat = NCR_700_readb(host, ISTAT_REG))
& (SCSI_INT_PENDING | DMA_INT_PENDING)) {
__u32 dsps;
__u8 sstat0 = 0, dstat = 0;
__u32 dsp;
		Scsi_Cmnd *SCp = hostdata->cmd;
		enum NCR_700_Host_State state = hostdata->state;
if(istat & SCSI_INT_PENDING) {
udelay(10);
sstat0 = NCR_700_readb(host, SSTAT0_REG);
}
if(istat & DMA_INT_PENDING) {
udelay(10);
dstat = NCR_700_readb(host, DSTAT_REG);
}
dsps = NCR_700_readl(host, DSPS_REG);
dsp = NCR_700_readl(host, DSP_REG);
DEBUG(("scsi%d: istat %02x sstat0 %02x dstat %02x dsp %04x[%08x] dsps 0x%x\n",
host->host_no, istat, sstat0, dstat,
(dsp - (__u32)virt_to_bus(hostdata->script))/4,
dsp, dsps));
if(SCp != NULL) {
pun = SCp->target;
lun = SCp->lun;
}
if(sstat0 & SCSI_RESET_DETECTED) {
Scsi_Device *SDp;
int i;
hostdata->state = NCR_700_HOST_BUSY;
printk(KERN_ERR "scsi%d: Bus Reset detected, executing command %p, slot %p, dsp %p[%04x]\n",
host->host_no, SCp, SCp == NULL ? NULL : SCp->host_scribble, (void *)dsp, dsp - hostdata->pScript);
/* clear all the negotiated parameters */
for(SDp = host->host_queue; SDp != NULL; SDp = SDp->next)
SDp->hostdata = 0;
/* clear all the slots and their pending commands */
for(i = 0; i < NCR_700_COMMAND_SLOTS_PER_HOST; i++) {
Scsi_Cmnd *SCp;
struct NCR_700_command_slot *slot =
&hostdata->slots[i];
if(slot->state == NCR_700_SLOT_FREE)
continue;
SCp = slot->cmnd;
printk(KERN_ERR " failing command because of reset, slot %p, cmnd %p\n",
slot, SCp);
free_slot(slot, hostdata);
SCp->host_scribble = NULL;
NCR_700_set_depth(SCp->device, 0);
/* NOTE: deadlock potential here: we
* rely on mid-layer guarantees that
* scsi_done won't try to issue the
* command again otherwise we'll
* deadlock on the
* hostdata->state_lock */
SCp->result = DID_RESET << 16;
SCp->scsi_done(SCp);
}
mdelay(25);
NCR_700_chip_setup(host);
hostdata->state = NCR_700_HOST_FREE;
hostdata->cmd = NULL;
goto out_unlock;
} else if(sstat0 & SELECTION_TIMEOUT) {
DEBUG(("scsi%d: (%d:%d) selection timeout\n",
host->host_no, pun, lun));
NCR_700_scsi_done(hostdata, SCp, DID_NO_CONNECT<<16);
} else if(sstat0 & PHASE_MISMATCH) {
struct NCR_700_command_slot *slot = (SCp == NULL) ? NULL :
(struct NCR_700_command_slot *)SCp->host_scribble;
if(dsp == Ent_SendMessage + 8 + hostdata->pScript) {
/* It wants to reply to some part of
* our message */
#ifdef NCR_700_DEBUG
__u32 temp = NCR_700_readl(host, TEMP_REG);
int count = (hostdata->script[Ent_SendMessage/4] & 0xffffff) - ((NCR_700_readl(host, DBC_REG) & 0xffffff) + NCR_700_data_residual(host));
printk("scsi%d (%d:%d) PHASE MISMATCH IN SEND MESSAGE %d remain, return %p[%04x], phase %s\n", host->host_no, pun, lun, count, (void *)temp, temp - hostdata->pScript, sbcl_to_string(NCR_700_readb(host, SBCL_REG)));
#endif
resume_offset = hostdata->pScript + Ent_SendMessagePhaseMismatch;
} else if(dsp >= virt_to_bus(&slot->SG[0].ins) &&
dsp <= virt_to_bus(&slot->SG[NCR_700_SG_SEGMENTS].ins)) {
int data_transfer = NCR_700_readl(host, DBC_REG) & 0xffffff;
int SGcount = (dsp - virt_to_bus(&slot->SG[0].ins))/sizeof(struct NCR_700_SG_List);
int residual = NCR_700_data_residual(host);
int i;
#ifdef NCR_700_DEBUG
printk("scsi%d: (%d:%d) Expected phase mismatch in slot->SG[%d], transferred 0x%x\n",
host->host_no, pun, lun,
SGcount, data_transfer);
print_command(SCp->cmnd);
if(residual) {
printk("scsi%d: (%d:%d) Expected phase mismatch in slot->SG[%d], transferred 0x%x, residual %d\n",
host->host_no, pun, lun,
SGcount, data_transfer, residual);
}
#endif
data_transfer += residual;
if(data_transfer != 0) {
int count;
__u32 pAddr;
SGcount--;
count = (bS_to_cpu(slot->SG[SGcount].ins) & 0x00ffffff);
DEBUG(("DATA TRANSFER MISMATCH, count = %d, transferred %d\n", count, count-data_transfer));
slot->SG[SGcount].ins &= bS_to_host(0xff000000);
slot->SG[SGcount].ins |= bS_to_host(data_transfer);
pAddr = bS_to_cpu(slot->SG[SGcount].pAddr);
pAddr += (count - data_transfer);
slot->SG[SGcount].pAddr = bS_to_host(pAddr);
}
/* set the executed moves to nops */
for(i=0; i<SGcount; i++) {
slot->SG[i].ins = bS_to_host(SCRIPT_NOP);
slot->SG[i].pAddr = 0;
}
dma_cache_wback((unsigned long)slot->SG, sizeof(slot->SG));
/* and pretend we disconnected after
* the command phase */
resume_offset = hostdata->pScript + Ent_MsgInDuringData;
} else {
__u8 sbcl = NCR_700_readb(host, SBCL_REG);
printk(KERN_ERR "scsi%d: (%d:%d) phase mismatch at %04x, phase %s\n",
host->host_no, pun, lun, dsp - hostdata->pScript, sbcl_to_string(sbcl));
NCR_700_internal_bus_reset(host);
}
} else if(sstat0 & SCSI_GROSS_ERROR) {
printk(KERN_ERR "scsi%d: (%d:%d) GROSS ERROR\n",
host->host_no, pun, lun);
NCR_700_scsi_done(hostdata, SCp, DID_ERROR<<16);
} else if(dstat & SCRIPT_INT_RECEIVED) {
DEBUG(("scsi%d: (%d:%d) ====>SCRIPT INTERRUPT<====\n",
host->host_no, pun, lun));
resume_offset = process_script_interrupt(dsps, dsp, SCp, host, hostdata);
} else if(dstat & (ILGL_INST_DETECTED)) {
printk(KERN_ERR "scsi%d: (%d:%d) Illegal Instruction detected at 0x%p[0x%x]!!!\n"
" Please email James.Bottomley@HansenPartnership.com with the details\n",
host->host_no, pun, lun,
(void *)dsp, dsp - hostdata->pScript);
NCR_700_scsi_done(hostdata, SCp, DID_ERROR<<16);
} else if(dstat & (WATCH_DOG_INTERRUPT|ABORTED)) {
printk(KERN_ERR "scsi%d: (%d:%d) serious DMA problem, dstat=%02x\n",
host->host_no, pun, lun, dstat);
NCR_700_scsi_done(hostdata, SCp, DID_ERROR<<16);
}
/* NOTE: selection interrupt processing MUST occur
* after script interrupt processing to correctly cope
* with the case where we process a disconnect and
* then get reselected before we process the
* disconnection */
if(sstat0 & SELECTED) {
/* FIXME: It currently takes at least FOUR
* interrupts to complete a command that
* disconnects: one for the disconnect, one
* for the reselection, one to get the
* reselection data and one to complete the
* command. If we guess the reselected
* command here and prepare it, we only need
* to get a reselection data interrupt if we
* guessed wrongly. Since the interrupt
* overhead is much greater than the command
* setup, this would be an efficient
* optimisation particularly as we probably
* only have one outstanding command on a
* target most of the time */
resume_offset = process_selection(host, dsp);
}
}
if(resume_offset) {
if(hostdata->state != NCR_700_HOST_BUSY) {
printk(KERN_ERR "scsi%d: Driver error: resume at %p [%04x] with non busy host!\n",
host->host_no, (void *)resume_offset, resume_offset - hostdata->pScript);
hostdata->state = NCR_700_HOST_BUSY;
}
DEBUG(("Attempting to resume at %x\n", resume_offset));
NCR_700_writeb(CLR_FIFO, host, DFIFO_REG);
NCR_700_writel(resume_offset, host, DSP_REG);
}
/* There is probably a technical no-no about this: if we're on a
 * shared interrupt and we got this interrupt because the
 * other device needs servicing, not us, we're still going to
 * check our queued commands here---of course, there shouldn't
 * be any outstanding.... */
if(hostdata->state == NCR_700_HOST_FREE) {
int i;
for(i = 0; i < NCR_700_COMMAND_SLOTS_PER_HOST; i++) {
/* fairness: always run the queue from the last
* position we left off */
int j = (i + hostdata->saved_slot_position)
% NCR_700_COMMAND_SLOTS_PER_HOST;
if(hostdata->slots[j].state != NCR_700_SLOT_QUEUED)
continue;
if(NCR_700_start_command(hostdata->slots[j].cmnd)) {
DEBUG(("scsi%d: Issuing saved command slot %p, cmd %p\t\n",
host->host_no, &hostdata->slots[j],
hostdata->slots[j].cmnd));
hostdata->saved_slot_position = j + 1;
}
break;
}
}
out_unlock:
spin_unlock_irqrestore(&io_request_lock, flags);
}
/* FIXME: Need to put some proc information in and plumb it
* into the scsi proc system */
STATIC int
NCR_700_proc_directory_info(char *proc_buf, char **startp,
off_t offset, int bytes_available,
int host_no, int write)
{
static char buf[4096]; /* 1 page should be sufficient */
int len = 0;
struct Scsi_Host *host = scsi_hostlist;
struct NCR_700_Host_Parameters *hostdata;
Scsi_Device *SDp;
while(host != NULL && host->host_no != host_no)
host = host->next;
if(host == NULL)
return 0;
if(write) {
/* FIXME: Clear internal statistics here */
return 0;
}
hostdata = (struct NCR_700_Host_Parameters *)host->hostdata[0];
len += sprintf(&buf[len], "Total commands outstanding: %d\n", hostdata->command_slot_count);
len += sprintf(&buf[len],"\
Target Depth Active Next Tag\n\
====== ===== ====== ========\n");
for(SDp = host->host_queue; SDp != NULL; SDp = SDp->next) {
len += sprintf(&buf[len]," %2d:%2d %4d %4d %4d\n", SDp->id, SDp->lun, SDp->queue_depth, NCR_700_get_depth(SDp), SDp->current_tag);
}
if((len -= offset) <= 0)
return 0;
if(len > bytes_available)
len = bytes_available;
memcpy(proc_buf, buf + offset, len);
return len;
}
STATIC int
NCR_700_queuecommand(Scsi_Cmnd *SCp, void (*done)(Scsi_Cmnd *))
{
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)SCp->host->hostdata[0];
__u32 move_ins;
struct NCR_700_command_slot *slot;
int hash;
if(hostdata->command_slot_count >= NCR_700_COMMAND_SLOTS_PER_HOST) {
/* We're over our allocation; this should never happen
 * since we report the max allocation to the mid layer */
printk(KERN_WARNING "scsi%d: Command depth has gone over queue depth\n", SCp->host->host_no);
return 1;
}
if(NCR_700_get_depth(SCp->device) != 0 && !(hostdata->tag_negotiated & (1<<SCp->target))) {
DEBUG((KERN_ERR "scsi%d (%d:%d) has non zero depth %d\n",
SCp->host->host_no, SCp->target, SCp->lun,
NCR_700_get_depth(SCp->device)));
return 1;
}
if(NCR_700_get_depth(SCp->device) >= NCR_700_MAX_TAGS) {
DEBUG((KERN_ERR "scsi%d (%d:%d) has max tag depth %d\n",
SCp->host->host_no, SCp->target, SCp->lun,
NCR_700_get_depth(SCp->device)));
return 1;
}
NCR_700_set_depth(SCp->device, NCR_700_get_depth(SCp->device) + 1);
/* begin the command here */
/* no need to check for NULL: the test for command_slot_count above
 * ensures a slot is free */
slot = find_empty_slot(hostdata);
slot->cmnd = SCp;
SCp->scsi_done = done;
SCp->host_scribble = (unsigned char *)slot;
SCp->SCp.ptr = NULL;
SCp->SCp.buffer = NULL;
#ifdef NCR_700_DEBUG
printk("53c700: scsi%d, command ", SCp->host->host_no);
print_command(SCp->cmnd);
#endif
if(hostdata->tag_negotiated &(1<<SCp->target)) {
struct NCR_700_command_slot *old =
find_ITL_Nexus(hostdata, SCp->target, SCp->lun);
#ifdef NCR_700_TAG_DEBUG
struct NCR_700_command_slot *found;
#endif
if(old != NULL && old->tag == SCp->device->current_tag) {
printk(KERN_WARNING "scsi%d (%d:%d) Tag clock back to current, queueing\n", SCp->host->host_no, SCp->target, SCp->lun);
return 1;
}
slot->tag = SCp->device->current_tag++;
#ifdef NCR_700_TAG_DEBUG
while((found = find_ITLQ_Nexus(hostdata, SCp->target, SCp->lun, slot->tag)) != NULL) {
printk("\n\n**ERROR** already using tag %d, but oldest is %d\n", slot->tag, (old == NULL) ? -1 : old->tag);
printk(" FOUND = %p, tag = %d, pun = %d, lun = %d\n",
found, found->tag, found->cmnd->target, found->cmnd->lun);
slot->tag = SCp->device->current_tag++;
printk(" Tag list is: ");
while(old != NULL) {
if(old->cmnd->target == SCp->target &&
old->cmnd->lun == SCp->lun)
printk("%d ", old->tag);
old = old->ITL_back;
}
printk("\n\n");
}
#endif
hash = hash_ITLQ(SCp->target, SCp->lun, slot->tag);
/* link into the ITLQ hash queues */
slot->ITLQ_forw = hostdata->ITLQ_Hash_forw[hash];
hostdata->ITLQ_Hash_forw[hash] = slot;
#ifdef NCR_700_TAG_DEBUG
if(slot->ITLQ_forw != NULL && slot->ITLQ_forw->ITLQ_back != NULL) {
printk(KERN_ERR "scsi%d (%d:%d) ITLQ_back is not NULL!!!!\n", SCp->host->host_no, SCp->target, SCp->lun);
}
#endif
if(slot->ITLQ_forw != NULL)
slot->ITLQ_forw->ITLQ_back = slot;
else
hostdata->ITLQ_Hash_back[hash] = slot;
slot->ITLQ_back = NULL;
} else {
slot->tag = NCR_700_NO_TAG;
}
/* link into the ITL hash queues */
hash = hash_ITL(SCp->target, SCp->lun);
slot->ITL_forw = hostdata->ITL_Hash_forw[hash];
hostdata->ITL_Hash_forw[hash] = slot;
#ifdef NCR_700_TAG_DEBUG
if(slot->ITL_forw != NULL && slot->ITL_forw->ITL_back != NULL) {
printk(KERN_ERR "scsi%d (%d:%d) ITL_back is not NULL!!!!\n",
SCp->host->host_no, SCp->target, SCp->lun);
}
#endif
if(slot->ITL_forw != NULL)
slot->ITL_forw->ITL_back = slot;
else
hostdata->ITL_Hash_back[hash] = slot;
slot->ITL_back = NULL;
/* This is f****g ridiculous; every low level HBA driver has
* to determine the direction of the commands, why isn't this
* done inside the scsi_lib !!??? */
switch (SCp->cmnd[0]) {
case REQUEST_SENSE:
/* clear the internal sense magic */
SCp->cmnd[6] = 0;
/* fall through */
case INQUIRY:
case MODE_SENSE:
case READ_6:
case READ_10:
case READ_12:
case READ_CAPACITY:
case READ_BLOCK_LIMITS:
case READ_TOC:
move_ins = SCRIPT_MOVE_DATA_IN;
break;
case MODE_SELECT:
case WRITE_6:
case WRITE_10:
case WRITE_12:
move_ins = SCRIPT_MOVE_DATA_OUT;
break;
case TEST_UNIT_READY:
case ALLOW_MEDIUM_REMOVAL:
case START_STOP:
move_ins = 0;
break;
default:
/* OK, get it from the command */
switch(SCp->sc_data_direction) {
case SCSI_DATA_UNKNOWN:
default:
printk(KERN_ERR "53c700: Unknown command for data direction ");
print_command(SCp->cmnd);
move_ins = 0;
break;
case SCSI_DATA_NONE:
move_ins = 0;
break;
case SCSI_DATA_READ:
move_ins = SCRIPT_MOVE_DATA_IN;
break;
case SCSI_DATA_WRITE:
move_ins = SCRIPT_MOVE_DATA_OUT;
break;
}
}
/* now build the scatter gather list */
if(move_ins != 0) {
int i;
for(i = 0; i < (SCp->use_sg ? SCp->use_sg : 1); i++) {
void *vPtr;
__u32 count;
if(SCp->use_sg) {
vPtr = (((struct scatterlist *)SCp->buffer)[i].address);
count = ((struct scatterlist *)SCp->buffer)[i].length;
} else {
vPtr = SCp->request_buffer;
count = SCp->request_bufflen;
}
slot->SG[i].ins = bS_to_host(move_ins | count);
DEBUG((" scatter block %d: move %d[%08x] from 0x%lx\n",
i, count, slot->SG[i].ins,
virt_to_bus(vPtr)));
dma_cache_wback_inv((unsigned long)vPtr, count);
slot->SG[i].pAddr = bS_to_host(virt_to_bus(vPtr));
}
slot->SG[i].ins = bS_to_host(SCRIPT_RETURN);
slot->SG[i].pAddr = 0;
dma_cache_wback((unsigned long)slot->SG, sizeof(slot->SG));
DEBUG((" SETTING %08lx to %x\n",
virt_to_bus(&slot->SG[i].ins),
slot->SG[i].ins));
}
slot->resume_offset = 0;
NCR_700_start_command(SCp);
return 0;
}
STATIC int
NCR_700_abort(Scsi_Cmnd * SCp)
{
struct NCR_700_command_slot *slot;
struct NCR_700_Host_Parameters *hostdata =
(struct NCR_700_Host_Parameters *)SCp->host->hostdata[0];
printk(KERN_INFO "scsi%d (%d:%d) New error handler wants to abort command\n\t",
SCp->host->host_no, SCp->target, SCp->lun);
print_command(SCp->cmnd);
slot = find_ITL_Nexus(hostdata, SCp->target, SCp->lun);
while(slot != NULL && slot->cmnd != SCp)
slot = slot->ITL_back;
if(slot == NULL)
/* no outstanding command to abort */
return SUCCESS;
if(SCp->cmnd[0] == TEST_UNIT_READY) {
/* FIXME: This is because of a problem in the new
* error handler. When it is in error recovery, it
* will send a TUR to a device it thinks may still be
* showing a problem. If the TUR isn't responded to,
* it will abort it and mark the device off line.
* Unfortunately, it does no other error recovery, so
* this would leave us with an outstanding command
* occupying a slot. Rather than allow this to
* happen, we issue a bus reset to force all
* outstanding commands to terminate here. */
NCR_700_internal_bus_reset(SCp->host);
/* still drop through and return failed */
}
return FAILED;
}
STATIC int
NCR_700_bus_reset(Scsi_Cmnd * SCp)
{
printk(KERN_INFO "scsi%d (%d:%d) New error handler wants BUS reset, cmd %p\n\t",
SCp->host->host_no, SCp->target, SCp->lun, SCp);
print_command(SCp->cmnd);
NCR_700_internal_bus_reset(SCp->host);
return SUCCESS;
}
STATIC int
NCR_700_dev_reset(Scsi_Cmnd * SCp)
{
printk(KERN_INFO "scsi%d (%d:%d) New error handler wants device reset\n\t",
SCp->host->host_no, SCp->target, SCp->lun);
print_command(SCp->cmnd);
return FAILED;
}
STATIC int
NCR_700_host_reset(Scsi_Cmnd * SCp)
{
printk(KERN_INFO "scsi%d (%d:%d) New error handler wants HOST reset\n\t",
SCp->host->host_no, SCp->target, SCp->lun);
print_command(SCp->cmnd);
NCR_700_internal_bus_reset(SCp->host);
NCR_700_chip_reset(SCp->host);
return SUCCESS;
}
EXPORT_SYMBOL(NCR_700_detect);
EXPORT_SYMBOL(NCR_700_release);
EXPORT_SYMBOL(NCR_700_intr);
......@@ -12,6 +12,7 @@
*
* History:
*
* 2001/09/19 USB_ZERO_PACKET support (Jean Tourrilhes)
* 2001/07/17 power management and pmac cleanup (Benjamin Herrenschmidt)
* 2001/03/24 td/ed hashing to remove bus_to_virt (Steve Longerbeam);
pci_map_single (db)
......@@ -537,6 +538,7 @@ static int sohci_submit_urb (urb_t * urb)
ed_t * ed;
urb_priv_t * urb_priv;
unsigned int pipe = urb->pipe;
int maxps = usb_maxpacket (urb->dev, pipe, usb_pipeout (pipe));
int i, size = 0;
unsigned long flags;
int bustime = 0;
......@@ -579,6 +581,15 @@ static int sohci_submit_urb (urb_t * urb)
switch (usb_pipetype (pipe)) {
case PIPE_BULK: /* one TD for every 4096 Byte */
size = (urb->transfer_buffer_length - 1) / 4096 + 1;
/* If the transfer size is a multiple of the pipe mtu,
 * we may need an extra TD to create an empty frame
 * Jean II */
if ((urb->transfer_flags & USB_ZERO_PACKET) &&
usb_pipeout (pipe) &&
(urb->transfer_buffer_length != 0) &&
((urb->transfer_buffer_length % maxps) == 0))
size++;
break;
case PIPE_ISOCHRONOUS: /* number of packets from URB */
size = urb->number_of_packets;
......@@ -1338,6 +1349,7 @@ static void td_submit_urb (urb_t * urb)
ohci_t * ohci = (ohci_t *) urb->dev->bus->hcpriv;
dma_addr_t data;
int data_len = urb->transfer_buffer_length;
int maxps = usb_maxpacket (urb->dev, urb->pipe, usb_pipeout (urb->pipe));
int cnt = 0;
__u32 info = 0;
unsigned int toggle = 0;
......@@ -1374,6 +1386,19 @@ static void td_submit_urb (urb_t * urb)
TD_CC | TD_DP_OUT : TD_CC | TD_R | TD_DP_IN ;
td_fill (ohci, info | (cnt? TD_T_TOGGLE:toggle), data, data_len, urb, cnt);
cnt++;
/* If the transfer size is a multiple of the pipe mtu,
 * we may need an extra TD to create an empty frame
 * Note : another way to check this condition is
 * to test if(urb_priv->length > cnt) - Jean II */
if ((urb->transfer_flags & USB_ZERO_PACKET) &&
usb_pipeout (urb->pipe) &&
(urb->transfer_buffer_length != 0) &&
((urb->transfer_buffer_length % maxps) == 0)) {
td_fill (ohci, info | (cnt? TD_T_TOGGLE:toggle), 0, 0, urb, cnt);
cnt++;
}
if (!ohci->sleeping)
writel (OHCI_BLF, &ohci->regs->cmdstatus); /* start bulk list */
break;
......
......@@ -697,7 +697,6 @@ int blkdev_get(struct block_device *bdev, mode_t mode, unsigned flags, int kind)
ret = bdev->bd_op->open(bdev->bd_inode, &fake_file);
if (!ret) {
bdev->bd_openers++;
atomic_inc(&bdev->bd_count);
} else if (!bdev->bd_openers) {
struct inode *bd_inode = bdev->bd_inode;
bdev->bd_op = NULL;
......@@ -778,7 +777,6 @@ int blkdev_put(struct block_device *bdev, int kind)
}
unlock_kernel();
up(&bdev->bd_sem);
bdput(bdev);
return ret;
}
......
......@@ -388,7 +388,7 @@
Work sponsored by SGI.
v0.83
19991107 Richard Gooch <rgooch@atnf.csiro.au>
Added DEVFS_ FL_WAIT flag.
Added DEVFS_FL_WAIT flag.
Work sponsored by SGI.
v0.84
19991107 Richard Gooch <rgooch@atnf.csiro.au>
......@@ -511,6 +511,40 @@
Removed broken devnum allocation and use <devfs_alloc_devnum>.
Fixed old devnum leak by calling new <devfs_dealloc_devnum>.
v0.107
20010712 Richard Gooch <rgooch@atnf.csiro.au>
Fixed bug in <devfs_setup> which could hang boot process.
v0.108
20010730 Richard Gooch <rgooch@atnf.csiro.au>
Added DEVFSD_NOTIFY_DELETE event.
20010801 Richard Gooch <rgooch@atnf.csiro.au>
Removed #include <asm/segment.h>.
v0.109
20010807 Richard Gooch <rgooch@atnf.csiro.au>
Fixed inode table races by removing it and using
inode->u.generic_ip instead.
Moved <devfs_read_inode> into <get_vfs_inode>.
Moved <devfs_write_inode> into <devfs_notify_change>.
v0.110
20010808 Richard Gooch <rgooch@atnf.csiro.au>
Fixed race in <devfs_do_symlink> for uni-processor.
v0.111
20010818 Richard Gooch <rgooch@atnf.csiro.au>
Removed remnant of multi-mount support in <devfs_mknod>.
Removed unused DEVFS_FL_SHOW_UNREG flag.
v0.112
20010820 Richard Gooch <rgooch@atnf.csiro.au>
Removed nlink field from struct devfs_inode.
v0.113
20010823 Richard Gooch <rgooch@atnf.csiro.au>
Replaced BKL with global rwsem to protect symlink data (quick
and dirty hack).
v0.114
20010827 Richard Gooch <rgooch@atnf.csiro.au>
Replaced global rwsem for symlink with per-link refcount.
v0.115
20010919 Richard Gooch <rgooch@atnf.csiro.au>
Set inode->i_mapping->a_ops for block nodes in <get_vfs_inode>.
v0.116
*/
#include <linux/types.h>
#include <linux/errno.h>
......@@ -533,21 +567,20 @@
#include <linux/smp_lock.h>
#include <linux/smp.h>
#include <linux/version.h>
#include <linux/rwsem.h>
#include <asm/uaccess.h>
#include <asm/io.h>
#include <asm/processor.h>
#include <asm/system.h>
#include <asm/pgtable.h>
#include <asm/segment.h>
#include <asm/bitops.h>
#include <asm/atomic.h>
#define DEVFS_VERSION "0.107 (20010709)"
#define DEVFS_VERSION "0.116 (20010919)"
#define DEVFS_NAME "devfs"
#define INODE_TABLE_INC 250
#define FIRST_INODE 1
#define STRING_LENGTH 256
......@@ -557,7 +590,7 @@
# define FALSE 0
#endif
#define IS_HIDDEN(de) (( ((de)->hide && !is_devfsd_or_child(fs_info)) || (!(de)->registered&& !(de)->show_unreg)))
#define IS_HIDDEN(de) (( ((de)->hide && !is_devfsd_or_child(fs_info)) || !(de)->registered))
#define DEBUG_NONE 0x00000
#define DEBUG_MODULE_LOAD 0x00001
......@@ -567,8 +600,8 @@
#define DEBUG_S_PUT 0x00010
#define DEBUG_I_LOOKUP 0x00020
#define DEBUG_I_CREATE 0x00040
#define DEBUG_I_READ 0x00080
#define DEBUG_I_WRITE 0x00100
#define DEBUG_I_GET 0x00080
#define DEBUG_I_CHANGE 0x00100
#define DEBUG_I_UNLINK 0x00200
#define DEBUG_I_RLINK 0x00400
#define DEBUG_I_FLINK 0x00800
......@@ -577,16 +610,12 @@
#define DEBUG_D_DELETE 0x04000
#define DEBUG_D_RELEASE 0x08000
#define DEBUG_D_IPUT 0x10000
#define DEBUG_ALL (DEBUG_MODULE_LOAD | DEBUG_REGISTER | \
DEBUG_SET_FLAGS | DEBUG_I_LOOKUP | \
DEBUG_I_UNLINK | DEBUG_I_MKNOD | \
DEBUG_D_RELEASE | DEBUG_D_IPUT)
#define DEBUG_ALL 0xfffff
#define DEBUG_DISABLED DEBUG_NONE
#define OPTION_NONE 0x00
#define OPTION_SHOW 0x01
#define OPTION_MOUNT 0x02
#define OPTION_ONLY 0x04
#define OPTION_MOUNT 0x01
#define OPTION_ONLY 0x02
#define OOPS(format, args...) {printk (format, ## args); \
printk ("Forcing Oops\n"); \
......@@ -630,8 +659,10 @@ struct fcb_type /* File, char, block type */
struct symlink_type
{
unsigned int length; /* Not including the NULL-terminator */
char *linkname; /* This is NULL-terminated */
atomic_t refcount; /* When this drops to zero, it's unused */
rwlock_t lock; /* Lock around the registered flag */
unsigned int length; /* Not including the NULL-terminator */
char *linkname; /* This is NULL-terminated */
};
struct fifo_type
......@@ -650,7 +681,6 @@ struct devfs_inode /* This structure is for "persistent" inode storage */
umode_t mode;
uid_t uid;
gid_t gid;
nlink_t nlink;
};
struct devfs_entry
......@@ -672,7 +702,6 @@ struct devfs_entry
umode_t mode;
unsigned short namelen; /* I think 64k+ filenames are a way off... */
unsigned char registered:1;
unsigned char show_unreg:1;
unsigned char hide:1;
unsigned char no_persistence:1;
char name[1]; /* This is just a dummy: the allocated array is
......@@ -693,9 +722,6 @@ struct devfsd_buf_entry
struct fs_info /* This structure is for the mounted devfs */
{
unsigned int num_inodes; /* Number of inodes created */
unsigned int table_size; /* Size of the inode pointer table */
struct devfs_entry **table;
struct super_block *sb;
volatile struct devfsd_buf_entry *devfsd_buffer;
spinlock_t devfsd_buffer_lock;
......@@ -765,7 +791,7 @@ static struct devfs_entry *search_for_entry_in_dir (struct devfs_entry *parent,
unsigned int namelen,
int traverse_symlink)
{
struct devfs_entry *curr;
struct devfs_entry *curr, *retval;
if ( !S_ISDIR (parent->mode) )
{
......@@ -781,48 +807,41 @@ static struct devfs_entry *search_for_entry_in_dir (struct devfs_entry *parent,
if (curr == NULL) return NULL;
if (!S_ISLNK (curr->mode) || !traverse_symlink) return curr;
/* Need to follow the link: this is a stack chomper */
return search_for_entry (parent,
curr->u.symlink.linkname, curr->u.symlink.length,
FALSE, FALSE, NULL, TRUE);
read_lock (&curr->u.symlink.lock);
if (!curr->registered)
{
read_unlock (&curr->u.symlink.lock);
return NULL;
}
atomic_inc (&curr->u.symlink.refcount);
read_unlock (&curr->u.symlink.lock);
retval = search_for_entry (parent, curr->u.symlink.linkname,
curr->u.symlink.length, FALSE, FALSE, NULL,
TRUE);
if ( atomic_dec_and_test (&curr->u.symlink.refcount) )
kfree (curr->u.symlink.linkname);
return retval;
} /* End Function search_for_entry_in_dir */
static struct devfs_entry *create_entry (struct devfs_entry *parent,
const char *name,unsigned int namelen)
{
struct devfs_entry *new, **table;
struct devfs_entry *new;
static unsigned long inode_counter = FIRST_INODE;
static spinlock_t counter_lock = SPIN_LOCK_UNLOCKED;
/* First ensure table size is enough */
if (fs_info.num_inodes >= fs_info.table_size)
{
if ( ( table = kmalloc (sizeof *table *
(fs_info.table_size + INODE_TABLE_INC),
GFP_KERNEL) ) == NULL ) return NULL;
fs_info.table_size += INODE_TABLE_INC;
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_CREATE)
printk ("%s: create_entry(): grew inode table to: %u entries\n",
DEVFS_NAME, fs_info.table_size);
#endif
if (fs_info.table)
{
memcpy (table, fs_info.table, sizeof *table *fs_info.num_inodes);
kfree (fs_info.table);
}
fs_info.table = table;
}
if ( name && (namelen < 1) ) namelen = strlen (name);
if ( ( new = kmalloc (sizeof *new + namelen, GFP_KERNEL) ) == NULL )
return NULL;
/* Magic: this will set the ctime to zero, thus subsequent lookups will
trigger the call to <update_devfs_inode_from_entry> */
memset (new, 0, sizeof *new + namelen);
spin_lock (&counter_lock);
new->inode.ino = inode_counter++;
spin_unlock (&counter_lock);
new->parent = parent;
if (name) memcpy (new->name, name, namelen);
new->namelen = namelen;
new->inode.ino = fs_info.num_inodes + FIRST_INODE;
new->inode.nlink = 1;
fs_info.table[fs_info.num_inodes] = new;
++fs_info.num_inodes;
if (parent == NULL) return new;
new->prev = parent->u.dir.last;
/* Insert into the parent directory's list of children */
......@@ -1083,14 +1102,10 @@ static struct devfs_entry *get_devfs_entry_from_vfs_inode (struct inode *inode,
int do_check)
{
struct devfs_entry *de;
struct fs_info *fs_info;
if (inode == NULL) return NULL;
if (inode->i_ino < FIRST_INODE) return NULL;
fs_info = inode->i_sb->u.generic_sbp;
if (fs_info == NULL) return NULL;
if (inode->i_ino - FIRST_INODE >= fs_info->num_inodes) return NULL;
de = fs_info->table[inode->i_ino - FIRST_INODE];
de = inode->u.generic_ip;
if (!de) printk (__FUNCTION__ "(): NULL de for inode %ld\n", inode->i_ino);
if (do_check && de && !de->registered) de = NULL;
return de;
} /* End Function get_devfs_entry_from_vfs_inode */
......@@ -1342,12 +1357,12 @@ devfs_handle_t devfs_register (devfs_handle_t dir, const char *name,
return NULL;
}
}
de->u.fcb.autogen = 0;
de->u.fcb.autogen = FALSE;
if ( S_ISCHR (mode) || S_ISBLK (mode) )
{
de->u.fcb.u.device.major = major;
de->u.fcb.u.device.minor = minor;
de->u.fcb.autogen = (devnum == NODEV) ? 0 : 1;
de->u.fcb.autogen = (devnum == NODEV) ? FALSE : TRUE;
}
else if ( S_ISREG (mode) ) de->u.fcb.u.file.size = 0;
else
......@@ -1377,8 +1392,6 @@ devfs_handle_t devfs_register (devfs_handle_t dir, const char *name,
++de->parent->u.dir.num_removable;
}
de->u.fcb.open = FALSE;
de->show_unreg = ( (boot_options & OPTION_SHOW)
|| (flags & DEVFS_FL_SHOW_UNREG) ) ? TRUE : FALSE;
de->hide = (flags & DEVFS_FL_HIDE) ? TRUE : FALSE;
de->no_persistence = (flags & DEVFS_FL_NO_PERSISTENCE) ? TRUE : FALSE;
de->registered = TRUE;
......@@ -1418,13 +1431,16 @@ static void unregister (struct devfs_entry *de)
MKDEV (de->u.fcb.u.device.major,
de->u.fcb.u.device.minor) );
}
de->u.fcb.autogen = 0;
de->u.fcb.autogen = FALSE;
return;
}
if (S_ISLNK (de->mode) && de->registered)
{
write_lock (&de->u.symlink.lock);
de->registered = FALSE;
kfree (de->u.symlink.linkname);
write_unlock (&de->u.symlink.lock);
if ( atomic_dec_and_test (&de->u.symlink.refcount) )
kfree (de->u.symlink.linkname);
return;
}
if ( S_ISFIFO (de->mode) )
......@@ -1475,7 +1491,7 @@ static int devfs_do_symlink (devfs_handle_t dir, const char *name,
{
int is_new;
unsigned int linklength;
char *newname;
char *newlink;
struct devfs_entry *de;
if (handle != NULL) *handle = NULL;
......@@ -1494,49 +1510,32 @@ static int devfs_do_symlink (devfs_handle_t dir, const char *name,
return -EINVAL;
}
linklength = strlen (link);
de = search_for_entry (dir, name, strlen (name), TRUE, TRUE, &is_new,
FALSE);
if (de == NULL) return -ENOMEM;
if (!S_ISLNK (de->mode) && de->registered)
if ( ( newlink = kmalloc (linklength + 1, GFP_KERNEL) ) == NULL )
return -ENOMEM;
memcpy (newlink, link, linklength);
newlink[linklength] = '\0';
if ( ( de = search_for_entry (dir, name, strlen (name), TRUE, TRUE,
&is_new, FALSE) ) == NULL )
{
printk ("%s: devfs_do_symlink(): non-link entry already exists\n",
DEVFS_NAME);
kfree (newlink);
return -ENOMEM;
}
if (de->registered)
{
kfree (newlink);
printk ("%s: devfs_do_symlink(%s): entry already exists\n",
DEVFS_NAME, name);
return -EEXIST;
}
if (handle != NULL) *handle = de;
de->mode = S_IFLNK | S_IRUGO | S_IXUGO;
de->info = info;
de->show_unreg = ( (boot_options & OPTION_SHOW)
|| (flags & DEVFS_FL_SHOW_UNREG) ) ? TRUE : FALSE;
de->hide = (flags & DEVFS_FL_HIDE) ? TRUE : FALSE;
/* Note there is no need to fiddle the dentry cache if the symlink changes
as the symlink follow method is called every time it's needed */
if ( de->registered && (linklength == de->u.symlink.length) )
{
/* New link is same length as old link */
if (memcmp (link, de->u.symlink.linkname, linklength) == 0) return 0;
return -EEXIST; /* Contents would change */
}
/* Have to create/update */
if (de->registered) return -EEXIST;
if ( ( newname = kmalloc (linklength + 1, GFP_KERNEL) ) == NULL )
{
struct devfs_entry *parent = de->parent;
if (!is_new) return -ENOMEM;
/* Have to clean up */
if (de->prev == NULL) parent->u.dir.first = de->next;
else de->prev->next = de->next;
if (de->next == NULL) parent->u.dir.last = de->prev;
else de->next->prev = de->prev;
kfree (de);
return -ENOMEM;
}
de->u.symlink.linkname = newname;
memcpy (de->u.symlink.linkname, link, linklength);
de->u.symlink.linkname[linklength] = '\0';
de->u.symlink.linkname = newlink;
de->u.symlink.length = linklength;
atomic_set (&de->u.symlink.refcount, 1);
rwlock_init (&de->u.symlink.lock);
de->registered = TRUE;
if (handle != NULL) *handle = de;
return 0;
} /* End Function devfs_do_symlink */
......@@ -1621,7 +1620,6 @@ devfs_handle_t devfs_mk_dir (devfs_handle_t dir, const char *name, void *info)
de->mode = S_IFDIR | S_IRUGO | S_IXUGO;
de->info = info;
if (!de->registered) de->u.dir.num_removable = 0;
de->show_unreg = (boot_options & OPTION_SHOW) ? TRUE : FALSE;
de->hide = FALSE;
de->registered = TRUE;
return de;
......@@ -1674,7 +1672,6 @@ int devfs_get_flags (devfs_handle_t de, unsigned int *flags)
if (de == NULL) return -EINVAL;
if (!de->registered) return -ENODEV;
if (de->show_unreg) fl |= DEVFS_FL_SHOW_UNREG;
if (de->hide) fl |= DEVFS_FL_HIDE;
if ( S_ISCHR (de->mode) || S_ISBLK (de->mode) || S_ISREG (de->mode) )
{
......@@ -1704,7 +1701,6 @@ int devfs_set_flags (devfs_handle_t de, unsigned int flags)
printk ("%s: devfs_set_flags(): de->name: \"%s\"\n",
DEVFS_NAME, de->name);
#endif
de->show_unreg = (flags & DEVFS_FL_SHOW_UNREG) ? TRUE : FALSE;
de->hide = (flags & DEVFS_FL_HIDE) ? TRUE : FALSE;
if ( S_ISCHR (de->mode) || S_ISBLK (de->mode) || S_ISREG (de->mode) )
{
......@@ -2053,16 +2049,16 @@ static int __init devfs_setup (char *str)
{"dmod", DEBUG_MODULE_LOAD, &devfs_debug_init},
{"dreg", DEBUG_REGISTER, &devfs_debug_init},
{"dunreg", DEBUG_UNREGISTER, &devfs_debug_init},
{"diread", DEBUG_I_READ, &devfs_debug_init},
{"diget", DEBUG_I_GET, &devfs_debug_init},
{"dchange", DEBUG_SET_FLAGS, &devfs_debug_init},
{"diwrite", DEBUG_I_WRITE, &devfs_debug_init},
{"dichange", DEBUG_I_CHANGE, &devfs_debug_init},
{"dimknod", DEBUG_I_MKNOD, &devfs_debug_init},
{"dilookup", DEBUG_I_LOOKUP, &devfs_debug_init},
{"diunlink", DEBUG_I_UNLINK, &devfs_debug_init},
#endif /* CONFIG_DEVFS_DEBUG */
{"show", OPTION_SHOW, &boot_options},
{"only", OPTION_ONLY, &boot_options},
{"mount", OPTION_MOUNT, &boot_options},
{NULL, 0, NULL}
};
while ( (*str != '\0') && !isspace (*str) )
......@@ -2074,7 +2070,7 @@ static int __init devfs_setup (char *str)
invert = 1;
str += 2;
}
for (i = 0; i < sizeof (devfs_options_tab); i++)
for (i = 0; devfs_options_tab[i].name != NULL; i++)
{
int len = strlen (devfs_options_tab[i].name);
......@@ -2247,122 +2243,37 @@ static struct file_operations devfs_fops;
static struct file_operations devfs_dir_fops;
static struct inode_operations devfs_symlink_iops;
static void devfs_read_inode (struct inode *inode)
{
struct devfs_entry *de;
de = get_devfs_entry_from_vfs_inode (inode, TRUE);
if (de == NULL)
{
printk ("%s: read_inode(%d): VFS inode: %p NO devfs_entry\n",
DEVFS_NAME, (int) inode->i_ino, inode);
return;
}
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_READ)
printk ("%s: read_inode(%d): VFS inode: %p devfs_entry: %p\n",
DEVFS_NAME, (int) inode->i_ino, inode, de);
#endif
inode->i_blocks = 0;
inode->i_blksize = 1024;
inode->i_op = &devfs_iops;
inode->i_fop = &devfs_fops;
inode->i_rdev = NODEV;
if ( S_ISCHR (de->inode.mode) )
{
inode->i_rdev = MKDEV (de->u.fcb.u.device.major,
de->u.fcb.u.device.minor);
inode->i_cdev = cdget (kdev_t_to_nr(inode->i_rdev));
}
else if ( S_ISBLK (de->inode.mode) )
{
inode->i_rdev = MKDEV (de->u.fcb.u.device.major,
de->u.fcb.u.device.minor);
inode->i_bdev = bdget (kdev_t_to_nr(inode->i_rdev));
if (inode->i_bdev)
{
if (!inode->i_bdev->bd_op && de->u.fcb.ops)
inode->i_bdev->bd_op = de->u.fcb.ops;
}
else printk ("%s: read_inode(%d): no block device from bdget()\n",
DEVFS_NAME, (int) inode->i_ino);
}
else if ( S_ISFIFO (de->inode.mode) ) inode->i_fop = &def_fifo_fops;
else if ( S_ISREG (de->inode.mode) ) inode->i_size = de->u.fcb.u.file.size;
else if ( S_ISDIR (de->inode.mode) )
{
inode->i_op = &devfs_dir_iops;
inode->i_fop = &devfs_dir_fops;
}
else if ( S_ISLNK (de->inode.mode) )
{
inode->i_op = &devfs_symlink_iops;
inode->i_size = de->u.symlink.length;
}
inode->i_mode = de->inode.mode;
inode->i_uid = de->inode.uid;
inode->i_gid = de->inode.gid;
inode->i_atime = de->inode.atime;
inode->i_mtime = de->inode.mtime;
inode->i_ctime = de->inode.ctime;
inode->i_nlink = de->inode.nlink;
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_READ)
printk ("%s: mode: 0%o uid: %d gid: %d\n",
DEVFS_NAME, (int) inode->i_mode,
(int) inode->i_uid, (int) inode->i_gid);
#endif
} /* End Function devfs_read_inode */
static void devfs_write_inode (struct inode *inode, int wait)
static int devfs_notify_change (struct dentry *dentry, struct iattr *iattr)
{
int index;
int retval;
struct devfs_entry *de;
struct inode *inode = dentry->d_inode;
struct fs_info *fs_info = inode->i_sb->u.generic_sbp;
if (inode->i_ino < FIRST_INODE) return;
index = inode->i_ino - FIRST_INODE;
lock_kernel ();
if (index >= fs_info->num_inodes)
{
printk ("%s: writing inode: %lu for which there is no entry!\n",
DEVFS_NAME, inode->i_ino);
unlock_kernel ();
return;
}
de = fs_info->table[index];
de = get_devfs_entry_from_vfs_inode (inode, TRUE);
if (de == NULL) return -ENODEV;
retval = inode_change_ok (inode, iattr);
if (retval != 0) return retval;
retval = inode_setattr (inode, iattr);
if (retval != 0) return retval;
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_WRITE)
if (devfs_debug & DEBUG_I_CHANGE)
{
printk ("%s: write_inode(%d): VFS inode: %p devfs_entry: %p\n",
printk ("%s: notify_change(%d): VFS inode: %p devfs_entry: %p\n",
DEVFS_NAME, (int) inode->i_ino, inode, de);
printk ("%s: mode: 0%o uid: %d gid: %d\n",
DEVFS_NAME, (int) inode->i_mode,
(int) inode->i_uid, (int) inode->i_gid);
}
#endif
/* Inode is not on hash chains, thus must save permissions here rather
than in a write_inode() method */
de->inode.mode = inode->i_mode;
de->inode.uid = inode->i_uid;
de->inode.gid = inode->i_gid;
de->inode.atime = inode->i_atime;
de->inode.mtime = inode->i_mtime;
de->inode.ctime = inode->i_ctime;
unlock_kernel ();
} /* End Function devfs_write_inode */
static int devfs_notify_change (struct dentry *dentry, struct iattr *iattr)
{
int retval;
struct devfs_entry *de;
struct inode *inode = dentry->d_inode;
struct fs_info *fs_info = inode->i_sb->u.generic_sbp;
de = get_devfs_entry_from_vfs_inode (inode, TRUE);
if (de == NULL) return -ENODEV;
retval = inode_change_ok (inode, iattr);
if (retval != 0) return retval;
retval = inode_setattr (inode, iattr);
if (retval != 0) return retval;
if ( iattr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID) )
devfsd_notify_one (de, DEVFSD_NOTIFY_CHANGE, inode->i_mode,
inode->i_uid, inode->i_gid, fs_info);
@@ -2382,8 +2293,7 @@ static int devfs_statfs (struct super_block *sb, struct statfs *buf)
static struct super_operations devfs_sops =
{
read_inode: devfs_read_inode,
write_inode: devfs_write_inode,
put_inode: force_delete,
statfs: devfs_statfs,
};
@@ -2412,8 +2322,69 @@ static struct inode *get_vfs_inode (struct super_block *sb,
printk (" old inode: %p\n", de->inode.dentry->d_inode);
return NULL;
}
if ( ( inode = iget (sb, de->inode.ino) ) == NULL ) return NULL;
if ( ( inode = new_inode (sb) ) == NULL )
{
printk ("%s: get_vfs_inode(%s): new_inode() failed, de: %p\n",
DEVFS_NAME, de->name, de);
return NULL;
}
de->inode.dentry = dentry;
inode->u.generic_ip = de;
inode->i_ino = de->inode.ino;
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_GET)
printk ("%s: get_vfs_inode(%d): VFS inode: %p devfs_entry: %p\n",
DEVFS_NAME, (int) inode->i_ino, inode, de);
#endif
inode->i_blocks = 0;
inode->i_blksize = 1024;
inode->i_op = &devfs_iops;
inode->i_fop = &devfs_fops;
inode->i_rdev = NODEV;
if ( S_ISCHR (de->inode.mode) )
{
inode->i_rdev = MKDEV (de->u.fcb.u.device.major,
de->u.fcb.u.device.minor);
inode->i_cdev = cdget (kdev_t_to_nr(inode->i_rdev));
}
else if ( S_ISBLK (de->inode.mode) )
{
inode->i_rdev = MKDEV (de->u.fcb.u.device.major,
de->u.fcb.u.device.minor);
inode->i_bdev = bdget ( kdev_t_to_nr (inode->i_rdev) );
inode->i_mapping->a_ops = &def_blk_aops;
if (inode->i_bdev)
{
if (!inode->i_bdev->bd_op && de->u.fcb.ops)
inode->i_bdev->bd_op = de->u.fcb.ops;
}
else printk ("%s: get_vfs_inode(%d): no block device from bdget()\n",
DEVFS_NAME, (int) inode->i_ino);
}
else if ( S_ISFIFO (de->inode.mode) ) inode->i_fop = &def_fifo_fops;
else if ( S_ISREG (de->inode.mode) ) inode->i_size = de->u.fcb.u.file.size;
else if ( S_ISDIR (de->inode.mode) )
{
inode->i_op = &devfs_dir_iops;
inode->i_fop = &devfs_dir_fops;
}
else if ( S_ISLNK (de->inode.mode) )
{
inode->i_op = &devfs_symlink_iops;
inode->i_size = de->u.symlink.length;
}
inode->i_mode = de->inode.mode;
inode->i_uid = de->inode.uid;
inode->i_gid = de->inode.gid;
inode->i_atime = de->inode.atime;
inode->i_mtime = de->inode.mtime;
inode->i_ctime = de->inode.ctime;
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_GET)
printk ("%s: mode: 0%o uid: %d gid: %d\n",
DEVFS_NAME, (int) inode->i_mode,
(int) inode->i_uid, (int) inode->i_gid);
#endif
return inode;
} /* End Function get_vfs_inode */
@@ -2726,14 +2697,14 @@ static struct dentry *devfs_lookup (struct inode *dir, struct dentry *dentry)
memcpy (txt, dentry->d_name.name,
(dentry->d_name.len >= STRING_LENGTH) ?
(STRING_LENGTH - 1) : dentry->d_name.len);
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_LOOKUP)
printk ("%s: lookup(%s): dentry: %p by: \"%s\"\n",
DEVFS_NAME, txt, dentry, current->comm);
#endif
fs_info = dir->i_sb->u.generic_sbp;
/* First try to get the devfs entry for this directory */
parent = get_devfs_entry_from_vfs_inode (dir, TRUE);
#ifdef CONFIG_DEVFS_DEBUG
if (devfs_debug & DEBUG_I_LOOKUP)
printk ("%s: lookup(%s): dentry: %p parent: %p by: \"%s\"\n",
DEVFS_NAME, txt, dentry, parent, current->comm);
#endif
if (parent == NULL) return ERR_PTR (-ENOENT);
/* Try to reclaim an existing devfs entry */
de = search_for_entry_in_dir (parent,
@@ -2824,6 +2795,7 @@ static int devfs_link (struct dentry *old_dentry, struct inode *dir,
static int devfs_unlink (struct inode *dir, struct dentry *dentry)
{
struct devfs_entry *de;
struct inode *inode = dentry->d_inode;
#ifdef CONFIG_DEVFS_DEBUG
char txt[STRING_LENGTH];
@@ -2839,9 +2811,18 @@ static int devfs_unlink (struct inode *dir, struct dentry *dentry)
de = get_devfs_entry_from_vfs_inode (dentry->d_inode, TRUE);
if (de == NULL) return -ENOENT;
de->registered = FALSE;
devfsd_notify_one (de, DEVFSD_NOTIFY_DELETE, inode->i_mode,
inode->i_uid, inode->i_gid, dir->i_sb->u.generic_sbp);
if ( S_ISLNK (de->mode) )
{
write_lock (&de->u.symlink.lock);
de->registered = FALSE;
write_unlock (&de->u.symlink.lock);
if ( atomic_dec_and_test (&de->u.symlink.refcount) )
kfree (de->u.symlink.linkname);
}
else de->registered = FALSE;
de->hide = TRUE;
if ( S_ISLNK (de->mode) ) kfree (de->u.symlink.linkname);
free_dentries (de);
return 0;
} /* End Function devfs_unlink */
@@ -2955,6 +2936,8 @@ static int devfs_rmdir (struct inode *dir, struct dentry *dentry)
}
}
if (has_children) return -ENOTEMPTY;
devfsd_notify_one (de, DEVFSD_NOTIFY_DELETE, inode->i_mode,
inode->i_uid, inode->i_gid, fs_info);
de->hide = TRUE;
de->registered = FALSE;
free_dentries (de);
@@ -2990,29 +2973,29 @@ static int devfs_mknod (struct inode *dir, struct dentry *dentry, int mode,
de = search_for_entry (parent, dentry->d_name.name, dentry->d_name.len,
FALSE, TRUE, &is_new, FALSE);
if (de == NULL) return -ENOMEM;
if (!de->registered)
if (de->registered)
{
/* Since we created the devfs entry we get to choose things */
de->info = NULL;
de->mode = mode;
if ( S_ISBLK (mode) || S_ISCHR (mode) )
{
de->u.fcb.u.device.major = MAJOR (rdev);
de->u.fcb.u.device.minor = MINOR (rdev);
de->u.fcb.default_uid = current->euid;
de->u.fcb.default_gid = current->egid;
de->u.fcb.ops = NULL;
de->u.fcb.auto_owner = FALSE;
de->u.fcb.aopen_notify = FALSE;
de->u.fcb.open = FALSE;
}
else if ( S_ISFIFO (mode) )
{
de->u.fifo.uid = current->euid;
de->u.fifo.gid = current->egid;
}
printk ("%s: mknod(): existing entry\n", DEVFS_NAME);
return -EEXIST;
}
de->info = NULL;
de->mode = mode;
if ( S_ISBLK (mode) || S_ISCHR (mode) )
{
de->u.fcb.u.device.major = MAJOR (rdev);
de->u.fcb.u.device.minor = MINOR (rdev);
de->u.fcb.default_uid = current->euid;
de->u.fcb.default_gid = current->egid;
de->u.fcb.ops = NULL;
de->u.fcb.auto_owner = FALSE;
de->u.fcb.aopen_notify = FALSE;
de->u.fcb.open = FALSE;
}
else if ( S_ISFIFO (mode) )
{
de->u.fifo.uid = current->euid;
de->u.fifo.gid = current->egid;
}
de->show_unreg = FALSE;
de->hide = FALSE;
de->inode.mode = mode;
de->inode.uid = current->euid;
@@ -3039,11 +3022,19 @@ static int devfs_readlink (struct dentry *dentry, char *buffer, int buflen)
int err;
struct devfs_entry *de;
lock_kernel ();
de = get_devfs_entry_from_vfs_inode (dentry->d_inode, TRUE);
err = de ? vfs_readlink (dentry, buffer, buflen,
de->u.symlink.linkname) : -ENODEV;
unlock_kernel ();
if (!de) return -ENODEV;
read_lock (&de->u.symlink.lock);
if (!de->registered)
{
read_unlock (&de->u.symlink.lock);
return -ENODEV;
}
atomic_inc (&de->u.symlink.refcount);
read_unlock (&de->u.symlink.lock);
err = vfs_readlink (dentry, buffer, buflen, de->u.symlink.linkname);
if ( atomic_dec_and_test (&de->u.symlink.refcount) )
kfree (de->u.symlink.linkname);
return err;
} /* End Function devfs_readlink */
@@ -3052,10 +3043,19 @@ static int devfs_follow_link (struct dentry *dentry, struct nameidata *nd)
int err;
struct devfs_entry *de;
lock_kernel ();
de = get_devfs_entry_from_vfs_inode (dentry->d_inode, TRUE);
err = de ? vfs_follow_link (nd, de->u.symlink.linkname) : -ENODEV;
unlock_kernel ();
if (!de) return -ENODEV;
read_lock (&de->u.symlink.lock);
if (!de->registered)
{
read_unlock (&de->u.symlink.lock);
return -ENODEV;
}
atomic_inc (&de->u.symlink.refcount);
read_unlock (&de->u.symlink.lock);
err = vfs_follow_link (nd, de->u.symlink.linkname);
if ( atomic_dec_and_test (&de->u.symlink.refcount) )
kfree (de->u.symlink.linkname);
return err;
} /* End Function devfs_follow_link */
......
@@ -39,6 +39,15 @@
Created <devfs_*alloc_major> and <devfs_*alloc_devnum>.
20010710 Richard Gooch <rgooch@atnf.csiro.au>
Created <devfs_*alloc_unique_number>.
20010730 Richard Gooch <rgooch@atnf.csiro.au>
Documentation typo fix.
20010806 Richard Gooch <rgooch@atnf.csiro.au>
Made <block_semaphore> and <char_semaphore> private.
20010813 Richard Gooch <rgooch@atnf.csiro.au>
Fixed bug in <devfs_alloc_unique_number>: limited to 128 numbers
20010818 Richard Gooch <rgooch@atnf.csiro.au>
Updated major masks up to Linus' "no new majors" proclamation.
Block: were 126 now 122 free, char: were 26 now 19 free.
*/
#include <linux/module.h>
#include <linux/init.h>
@@ -181,15 +190,15 @@ struct major_list
};
/* Block majors already assigned:
0-3, 7-9, 11-12, 13-63, 65-93, 95-99, 101, 103-111, 120-127, 199, 201,
240-255
0-3, 7-9, 11-63, 65-99, 101-113, 120-127, 199, 201, 240-255
Total free: 122
*/
static struct major_list block_major_list =
{SPIN_LOCK_UNLOCKED,
{0xfffffb8f, /* Majors 0 to 31 */
0xffffffff, /* Majors 32 to 63 */
0xbffffffe, /* Majors 64 to 95 */
0xff00ffaf, /* Majors 96 to 127 */
0xfffffffe, /* Majors 64 to 95 */
0xff03ffef, /* Majors 96 to 127 */
0x00000000, /* Majors 128 to 159 */
0x00000000, /* Majors 160 to 191 */
0x00000280, /* Majors 192 to 223 */
@@ -197,7 +206,8 @@ static struct major_list block_major_list =
};
/* Char majors already assigned:
0-7, 9-151, 154-158, 160-195, 198-211, 216-221, 224-225, 240-255
0-7, 9-151, 154-158, 160-211, 216-221, 224-230, 240-255
Total free: 19
*/
static struct major_list char_major_list =
{SPIN_LOCK_UNLOCKED,
@@ -207,14 +217,14 @@ static struct major_list char_major_list =
0xffffffff, /* Majors 96 to 127 */
0x7cffffff, /* Majors 128 to 159 */
0xffffffff, /* Majors 160 to 191 */
0x3f0fffcf, /* Majors 192 to 223 */
0xffff0003} /* Majors 224 to 255 */
0x3f0fffff, /* Majors 192 to 223 */
0xffff007f} /* Majors 224 to 255 */
};
/**
* devfs_alloc_major - Allocate a major number.
* @type: The type of the major (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLOCK)
* @type: The type of the major (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLK)
* Returns the allocated major, else -1 if none are available.
* This routine is thread safe and does not block.
@@ -238,7 +248,7 @@ EXPORT_SYMBOL(devfs_alloc_major);
/**
* devfs_dealloc_major - Deallocate a major number.
* @type: The type of the major (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLOCK)
* @type: The type of the major (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLK)
* @major: The major number.
* This routine is thread safe and does not block.
*/
@@ -273,16 +283,16 @@ struct device_list
int none_free;
};
DECLARE_MUTEX (block_semaphore);
static DECLARE_MUTEX (block_semaphore);
static struct device_list block_list;
DECLARE_MUTEX (char_semaphore);
static DECLARE_MUTEX (char_semaphore);
static struct device_list char_list;
/**
* devfs_alloc_devnum - Allocate a device number.
* @type: The type (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLOCK).
* @type: The type (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLK).
*
* Returns the allocated device number, else NODEV if none are available.
* This routine is thread safe and may block.
@@ -347,7 +357,7 @@ EXPORT_SYMBOL(devfs_alloc_devnum);
/**
* devfs_dealloc_devnum - Deallocate a device number.
* @type: The type (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLOCK).
* @type: The type (DEVFS_SPECIAL_CHR or DEVFS_SPECIAL_BLK).
* @devnum: The device number.
*
* This routine is thread safe and does not block.
@@ -437,6 +447,7 @@ int devfs_alloc_unique_number (struct unique_numspace *space)
space->length = length;
}
number = find_first_zero_bit (space->bits, space->length << 3);
--space->num_free;
__set_bit (number, space->bits);
up (&space->semaphore);
return number;
......
@@ -20,6 +20,7 @@
#define DEVFSD_NOTIFY_LOOKUP 4
#define DEVFSD_NOTIFY_CHANGE 5
#define DEVFSD_NOTIFY_CREATE 6
#define DEVFSD_NOTIFY_DELETE 7
#define DEVFS_PATHLEN 1024 /* Never change this otherwise the
binary interface will change */
......
@@ -30,17 +30,15 @@
is closed, ownership reverts back to
<<uid>> and <<gid>> and the protection
is set to read-write for all */
#define DEVFS_FL_SHOW_UNREG 0x002 /* Show unregistered entries in
directory listings */
#define DEVFS_FL_HIDE 0x004 /* Do not show entry in directory list */
#define DEVFS_FL_AUTO_DEVNUM 0x008 /* Automatically generate device number
#define DEVFS_FL_HIDE 0x002 /* Do not show entry in directory list */
#define DEVFS_FL_AUTO_DEVNUM 0x004 /* Automatically generate device number
*/
#define DEVFS_FL_AOPEN_NOTIFY 0x010 /* Asynchronously notify devfsd on open
#define DEVFS_FL_AOPEN_NOTIFY 0x008 /* Asynchronously notify devfsd on open
*/
#define DEVFS_FL_REMOVABLE 0x020 /* This is a removable media device */
#define DEVFS_FL_WAIT 0x040 /* Wait for devfsd to finish */
#define DEVFS_FL_NO_PERSISTENCE 0x080 /* Forget changes after unregister */
#define DEVFS_FL_CURRENT_OWNER 0x100 /* Set initial ownership to current */
#define DEVFS_FL_REMOVABLE 0x010 /* This is a removable media device */
#define DEVFS_FL_WAIT 0x020 /* Wait for devfsd to finish */
#define DEVFS_FL_NO_PERSISTENCE 0x040 /* Forget changes after unregister */
#define DEVFS_FL_CURRENT_OWNER 0x080 /* Set initial ownership to current */
#define DEVFS_FL_DEFAULT DEVFS_FL_NONE
......
@@ -135,15 +135,13 @@ struct rpc_xprt {
struct rpc_wait_queue sending; /* requests waiting to send */
struct rpc_wait_queue pending; /* requests in flight */
struct rpc_wait_queue backlog; /* waiting for slot */
struct rpc_wait_queue reconn; /* waiting for reconnect */
struct rpc_rqst * free; /* free slots */
struct rpc_rqst slot[RPC_MAXREQS];
unsigned long sockstate; /* Socket state */
unsigned char shutdown : 1, /* being shut down */
nocong : 1, /* no congestion control */
stream : 1, /* TCP */
tcp_more : 1, /* more record fragments */
connecting : 1; /* being reconnected */
tcp_more : 1; /* more record fragments */
/*
* State of TCP reply receive stuff
@@ -158,6 +156,8 @@ struct rpc_xprt {
/*
* Send stuff
*/
spinlock_t sock_lock; /* lock socket info */
spinlock_t xprt_lock; /* lock xprt info */
struct rpc_task * snd_task; /* Task blocked in send */
@@ -185,10 +185,9 @@ int xprt_adjust_timeout(struct rpc_timeout *);
void xprt_release(struct rpc_task *);
void xprt_reconnect(struct rpc_task *);
int xprt_clear_backlog(struct rpc_xprt *);
int xprt_tcp_pending(void);
void __rpciod_tcp_dispatcher(void);
extern struct list_head rpc_xprt_pending;
#define XPRT_WSPACE 0
#define XPRT_CONNECT 1
@@ -201,12 +200,6 @@ extern struct list_head rpc_xprt_pending;
#define xprt_test_and_set_connected(xp) (test_and_set_bit(XPRT_CONNECT, &(xp)->sockstate))
#define xprt_clear_connected(xp) (clear_bit(XPRT_CONNECT, &(xp)->sockstate))
static inline
int xprt_tcp_pending(void)
{
return !list_empty(&rpc_xprt_pending);
}
static inline
void rpciod_tcp_dispatcher(void)
{
......
@@ -170,6 +170,8 @@ extern unsigned long swap_cache_find_success;
extern spinlock_t pagemap_lru_lock;
extern void FASTCALL(mark_page_accessed(struct page *));
/*
* List add/del helper macros. These must be called
* with the pagemap_lru_lock held!
......
@@ -86,19 +86,18 @@ lookup_exec_domain(u_long personality)
if (try_inc_mod_count(ep->module))
goto out;
}
read_unlock(&exec_domains_lock);
#ifdef CONFIG_KMOD
read_unlock(&exec_domains_lock);
sprintf(buffer, "personality-%ld", pers);
request_module(buffer);
read_lock(&exec_domains_lock);
for (ep = exec_domains; ep; ep = ep->next) {
if (pers >= ep->pers_low && pers <= ep->pers_high)
if (try_inc_mod_count(ep->module))
goto out;
}
read_unlock(&exec_domains_lock);
#endif
ep = &default_exec_domain;
......
@@ -420,8 +420,6 @@ static inline struct page * __find_page_nolock(struct address_space *mapping, un
break;
}
SetPageReferenced(page);
not_found:
return page;
}
@@ -1086,6 +1084,26 @@ static void generic_file_readahead(int reada_ok,
return;
}
/*
* Mark a page as having seen activity.
*
* If it was already so marked, move it
* to the active queue and drop the referenced
* bit. Otherwise, just mark it for future
* action..
*/
void mark_page_accessed(struct page *page)
{
if (!PageActive(page) && PageReferenced(page)) {
activate_page(page);
ClearPageReferenced(page);
return;
}
/* Mark the page referenced, AFTER checking for previous usage.. */
SetPageReferenced(page);
}
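The comment above describes the "two touches" rule that this release's reference-bit cleanup introduces; a self-contained model of just that state transition (with a hypothetical `struct page_state` in place of the real `struct page` flag bits):

```c
#include <assert.h>

/* Hypothetical two-bit model of a page's aging state. */
struct page_state {
    int active;      /* page sits on the active list */
    int referenced;  /* touched once since the last aging pass */
};

/* Mirrors the logic of mark_page_accessed(): the first touch only
 * sets the referenced bit; a second touch while still inactive
 * promotes the page and restarts the aging clock. */
static void mark_accessed(struct page_state *p)
{
    if (!p->active && p->referenced) {
        p->active = 1;
        p->referenced = 0;
        return;
    }
    p->referenced = 1;
}
```

This is why a page streamed through once (e.g. a single sequential read) never reaches the active list, while a page touched repeatedly does.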
/*
* This is a generic file read routine, and uses the
* inode->i_op->readpage() function for the actual low-level
@@ -1203,6 +1221,7 @@ void do_generic_file_read(struct file * filp, loff_t *ppos, read_descriptor_t *
index += offset >> PAGE_CACHE_SHIFT;
offset &= ~PAGE_CACHE_MASK;
mark_page_accessed(page);
page_cache_release(page);
if (ret == nr && desc->count)
continue;
@@ -2534,6 +2553,7 @@ struct page *__read_cache_page(struct address_space *mapping,
}
if (cached_page)
page_cache_release(cached_page);
mark_page_accessed(page);
return page;
}
......
@@ -82,8 +82,13 @@ void __free_pte(pte_t pte)
* free_page() used to be able to clear swap cache
* entries. We may now have to do it manually.
*/
if (pte_dirty(pte) && page->mapping)
set_page_dirty(page);
if (page->mapping) {
if (pte_dirty(pte))
set_page_dirty(page);
if (pte_young(pte))
mark_page_accessed(page);
}
free_page_and_swap_cache(page);
}
......
@@ -340,11 +340,15 @@ struct page * __alloc_pages(unsigned int gfp_mask, unsigned int order, zonelist_
zone = zonelist->zones;
for (;;) {
unsigned long min;
zone_t *z = *(zone++);
if (!z)
break;
if (zone_free_pages(z, order) > (gfp_mask & __GFP_HIGH ? z->pages_min / 2 : z->pages_min)) {
min = z->pages_min;
if (!(gfp_mask & __GFP_WAIT))
min >>= 2;
if (zone_free_pages(z, order) > min) {
page = rmqueue(z, order);
if (page)
return page;
......
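The `__alloc_pages` hunk above is the changelog's "silly GFP_ATOMIC allocation bug" fix: instead of keying the watermark on `__GFP_HIGH`, callers that cannot sleep (no `__GFP_WAIT`) are allowed to dip to a quarter of `pages_min`. A sketch of just that check, with a hypothetical `watermark_ok` helper:

```c
#include <assert.h>

/* Models the per-zone check in __alloc_pages: sleeping allocators must
 * stay above pages_min; atomic allocators may use the emergency margin
 * down to pages_min/4 before the caller falls through to reclaim. */
static int watermark_ok(unsigned long free_pages, unsigned long pages_min,
                        int can_wait)
{
    unsigned long min = pages_min;

    if (!can_wait)
        min >>= 2;              /* GFP_ATOMIC-style callers: min/4 */
    return free_pages > min;    /* strictly above the watermark */
}
```

The asymmetry is deliberate: a `__GFP_WAIT` caller that fails here can reclaim pages itself, whereas an atomic caller has no fallback, so it gets the reserve.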
@@ -52,10 +52,12 @@ static inline int try_to_swap_out(struct mm_struct * mm, struct vm_area_struct*
/* Don't look at this pte if it's been accessed recently. */
if (ptep_test_and_clear_young(page_table)) {
flush_tlb_page(vma, address);
activate_page(page);
mark_page_accessed(page);
return 0;
}
if ((PageInactive(page) || PageActive(page)) && PageReferenced(page))
/* Don't bother with it if the page is otherwise active */
if (PageActive(page))
return 0;
if (TryLockPage(page))
@@ -91,8 +93,6 @@ static inline int try_to_swap_out(struct mm_struct * mm, struct vm_area_struct*
UnlockPage(page);
{
int freeable = page_count(page) - !!page->buffers <= 2;
if (freeable)
deactivate_page(page);
page_cache_release(page);
return freeable & right_classzone;
}
@@ -355,31 +355,10 @@ static int shrink_cache(struct list_head * lru, int * max_scan, int this_max_sca
this_max_scan--;
if (PageTestandClearReferenced(page)) {
if (!PageSwapCache(page)) {
if (PageInactive(page)) {
del_page_from_inactive_list(page);
add_page_to_active_list(page);
} else if (PageActive(page)) {
list_del(entry);
list_add(entry, &active_list);
} else
BUG();
} else {
list_del(entry);
list_add(entry, lru);
}
list_del(entry);
list_add(entry, lru);
if (PageTestandClearReferenced(page))
continue;
}
if (PageInactive(page)) {
/* just roll it over, no need to update any stat */
list_del(entry);
list_add(entry, &inactive_list);
} else {
del_page_from_active_list(page);
add_page_to_inactive_list(page);
}
if (unlikely(!memclass(page->zone, classzone)))
continue;
@@ -387,10 +366,8 @@ static int shrink_cache(struct list_head * lru, int * max_scan, int this_max_sca
__max_scan--;
/* Racy check to avoid trylocking when not worthwhile */
if (!page->buffers && page_count(page) != 1) {
activate_page_nolock(page);
if (!page->buffers && page_count(page) != 1)
continue;
}
/*
* The page is locked. IO in progress?
@@ -547,10 +524,17 @@ static void balance_inactive(int nr_pages)
page = list_entry(entry, struct page, lru);
entry = entry->prev;
if (PageTestandClearReferenced(page))
continue;
del_page_from_active_list(page);
add_page_to_inactive_list(page);
}
/* move active list to between "entry" and "entry->next" */
__list_del(active_list.prev, active_list.next);
__list_add(&active_list, entry, entry->next);
spin_unlock(&pagemap_lru_lock);
}
@@ -568,10 +552,6 @@ static int shrink_caches(int priority, zone_t * classzone, unsigned int gfp_mask
if (nr_pages <= 0)
return 0;
nr_pages = shrink_cache(&active_list, &max_scan, nr_active_pages / DEF_PRIORITY, nr_pages, classzone, gfp_mask);
if (nr_pages <= 0)
return 0;
shrink_dcache_memory(priority, gfp_mask);
shrink_icache_memory(priority, gfp_mask);
......
@@ -69,7 +69,7 @@ rpcauth_destroy(struct rpc_auth *auth)
auth->au_ops->destroy(auth);
}
spinlock_t rpc_credcache_lock = SPIN_LOCK_UNLOCKED;
static spinlock_t rpc_credcache_lock = SPIN_LOCK_UNLOCKED;
/*
* Initialize RPC credential cache
......
@@ -55,6 +55,8 @@ static void call_refresh(struct rpc_task *task);
static void call_refreshresult(struct rpc_task *task);
static void call_timeout(struct rpc_task *task);
static void call_reconnect(struct rpc_task *task);
static void child_reconnect(struct rpc_task *);
static void child_reconnect_status(struct rpc_task *);
static u32 * call_header(struct rpc_task *task);
static u32 * call_verify(struct rpc_task *task);
@@ -525,6 +527,7 @@ static void
call_reconnect(struct rpc_task *task)
{
struct rpc_clnt *clnt = task->tk_client;
struct rpc_task *child;
dprintk("RPC: %4d call_reconnect status %d\n",
task->tk_pid, task->tk_status);
@@ -532,10 +535,31 @@ call_reconnect(struct rpc_task *task)
task->tk_action = call_transmit;
if (task->tk_status < 0 || !clnt->cl_xprt->stream)
return;
clnt->cl_stats->netreconn++;
/* Run as a child to ensure it runs as an rpciod task */
child = rpc_new_child(clnt, task);
if (child) {
child->tk_action = child_reconnect;
rpc_run_child(task, child, NULL);
}
}
static void child_reconnect(struct rpc_task *task)
{
task->tk_client->cl_stats->netreconn++;
task->tk_status = 0;
task->tk_action = child_reconnect_status;
xprt_reconnect(task);
}
static void child_reconnect_status(struct rpc_task *task)
{
if (task->tk_status == -EAGAIN)
task->tk_action = child_reconnect;
else
task->tk_action = NULL;
}
/*
* 5. Transmit the RPC request, and wait for reply
*/
......
@@ -31,7 +31,7 @@
static struct rpc_clnt * pmap_create(char *, struct sockaddr_in *, int);
static void pmap_getport_done(struct rpc_task *);
extern struct rpc_program pmap_program;
spinlock_t pmap_lock = SPIN_LOCK_UNLOCKED;
static spinlock_t pmap_lock = SPIN_LOCK_UNLOCKED;
/*
* Obtain the port for a given RPC service on a given host. This one can
......
@@ -76,7 +76,7 @@ spinlock_t rpc_queue_lock = SPIN_LOCK_UNLOCKED;
/*
* Spinlock for other critical sections of code.
*/
spinlock_t rpc_sched_lock = SPIN_LOCK_UNLOCKED;
static spinlock_t rpc_sched_lock = SPIN_LOCK_UNLOCKED;
/*
* This is the last-ditch buffer for NFS swap requests
......
@@ -75,10 +75,6 @@ extern spinlock_t rpc_queue_lock;
* Local variables
*/
/* Spinlock for critical sections in the code. */
spinlock_t xprt_sock_lock = SPIN_LOCK_UNLOCKED;
spinlock_t xprt_lock = SPIN_LOCK_UNLOCKED;
#ifdef RPC_DEBUG
# undef RPC_DEBUG_DATA
# define RPCDBG_FACILITY RPCDBG_XPRT
@@ -171,6 +167,44 @@ xprt_move_iov(struct msghdr *msg, struct iovec *niv, unsigned amount)
msg->msg_iov=niv;
}
/*
* Serialize write access to sockets, in order to prevent different
* requests from interfering with each other.
* Also prevents TCP socket reconnections from colliding with writes.
*/
static int
xprt_lock_write(struct rpc_xprt *xprt, struct rpc_task *task)
{
int retval;
spin_lock_bh(&xprt->sock_lock);
if (!xprt->snd_task)
xprt->snd_task = task;
else if (xprt->snd_task != task) {
dprintk("RPC: %4d TCP write queue full (task %d)\n",
task->tk_pid, xprt->snd_task->tk_pid);
task->tk_timeout = 0;
task->tk_status = -EAGAIN;
rpc_sleep_on(&xprt->sending, task, NULL, NULL);
}
retval = xprt->snd_task == task;
spin_unlock_bh(&xprt->sock_lock);
return retval;
}
/*
* Releases the socket for use by other requests.
*/
static void
xprt_release_write(struct rpc_xprt *xprt, struct rpc_task *task)
{
spin_lock_bh(&xprt->sock_lock);
if (xprt->snd_task == task) {
xprt->snd_task = NULL;
rpc_wake_up_next(&xprt->sending);
}
spin_unlock_bh(&xprt->sock_lock);
}
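The `xprt_lock_write`/`xprt_release_write` pair above replaces the old global `xprt_lock` with a per-transport owner field, which is the core of Trond's RPC-over-TCP race fix. The ownership rule alone (ignoring the sleeping and wakeup that the kernel does on `xprt->sending`) can be sketched as:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one writer may own the socket at a time, and a
 * task that already owns it may "re-acquire" it freely. */
struct xprt_model {
    void *snd_task;   /* current owner, or NULL */
};

/* Returns 1 if "task" now owns the socket; 0 if another task does
 * (the kernel version would put the loser to sleep on xprt->sending). */
static int xprt_try_lock(struct xprt_model *x, void *task)
{
    if (x->snd_task == NULL)
        x->snd_task = task;
    return x->snd_task == task;
}

/* Only the owner may release; the kernel then wakes the next sleeper. */
static void xprt_unlock(struct xprt_model *x, void *task)
{
    if (x->snd_task == task)
        x->snd_task = NULL;
}
```

Because reconnects now take the same lock as transmits, a TCP reconnection can no longer interleave with another task's partially sent request.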
/*
* Write data to socket.
*/
@@ -285,7 +319,10 @@ xprt_adjust_cwnd(struct rpc_xprt *xprt, int result)
if (xprt->nocong)
return;
spin_lock_bh(&xprt_sock_lock);
/*
* Note: we're in a BH context
*/
spin_lock(&xprt->xprt_lock);
cwnd = xprt->cwnd;
if (result >= 0) {
if (xprt->cong < cwnd || time_before(jiffies, xprt->congtime))
@@ -313,7 +350,7 @@ xprt_adjust_cwnd(struct rpc_xprt *xprt, int result)
xprt->cwnd = cwnd;
out:
spin_unlock_bh(&xprt_sock_lock);
spin_unlock(&xprt->xprt_lock);
}
/*
@@ -394,6 +431,8 @@ xprt_disconnect(struct rpc_xprt *xprt)
/*
* Reconnect a broken TCP connection.
*
* Note: This cannot collide with the TCP reads, as both run from rpciod
*/
void
xprt_reconnect(struct rpc_task *task)
@@ -416,15 +455,10 @@ xprt_reconnect(struct rpc_task *task)
return;
}
spin_lock(&xprt_lock);
if (xprt->connecting) {
task->tk_timeout = 0;
rpc_sleep_on(&xprt->reconn, task, NULL, NULL);
spin_unlock(&xprt_lock);
if (!xprt_lock_write(xprt, task))
return;
}
xprt->connecting = 1;
spin_unlock(&xprt_lock);
if (xprt_connected(xprt))
goto out_write;
status = -ENOTCONN;
if (!inet) {
@@ -439,6 +473,7 @@ xprt_reconnect(struct rpc_task *task)
/* Reset TCP record info */
xprt->tcp_offset = 0;
xprt->tcp_reclen = 0;
xprt->tcp_copied = 0;
xprt->tcp_more = 0;
@@ -467,24 +502,22 @@ xprt_reconnect(struct rpc_task *task)
dprintk("RPC: %4d connect status %d connected %d\n",
task->tk_pid, status, xprt_connected(xprt));
spin_lock_bh(&xprt_sock_lock);
spin_lock_bh(&xprt->sock_lock);
if (!xprt_connected(xprt)) {
task->tk_timeout = xprt->timeout.to_maxval;
rpc_sleep_on(&xprt->reconn, task, xprt_reconn_status, NULL);
spin_unlock_bh(&xprt_sock_lock);
rpc_sleep_on(&xprt->sending, task, xprt_reconn_status, NULL);
spin_unlock_bh(&xprt->sock_lock);
return;
}
spin_unlock_bh(&xprt_sock_lock);
spin_unlock_bh(&xprt->sock_lock);
}
defer:
spin_lock(&xprt_lock);
xprt->connecting = 0;
if (status < 0) {
rpc_delay(task, 5*HZ);
task->tk_status = -ENOTCONN;
}
rpc_wake_up(&xprt->reconn);
spin_unlock(&xprt_lock);
out_write:
xprt_release_write(xprt, task);
}
/*
@@ -499,10 +532,7 @@ xprt_reconn_status(struct rpc_task *task)
dprintk("RPC: %4d xprt_reconn_timeout %d\n",
task->tk_pid, task->tk_status);
spin_lock(&xprt_lock);
xprt->connecting = 0;
rpc_wake_up(&xprt->reconn);
spin_unlock(&xprt_lock);
xprt_release_write(xprt, task);
}
/*
@@ -699,10 +729,6 @@ tcp_read_fraghdr(struct rpc_xprt *xprt)
struct iovec riov;
int want, result;
if (xprt->tcp_offset >= xprt->tcp_reclen + sizeof(xprt->tcp_recm)) {
xprt->tcp_offset = 0;
xprt->tcp_reclen = 0;
}
if (xprt->tcp_offset >= sizeof(xprt->tcp_recm))
goto done;
@@ -718,10 +744,6 @@ tcp_read_fraghdr(struct rpc_xprt *xprt)
want -= result;
} while (want);
/* Is this another fragment in the last message */
if (!xprt->tcp_more)
xprt->tcp_copied = 0; /* No, so we're reading a new message */
/* Get the record length and mask out the last fragment bit */
xprt->tcp_reclen = ntohl(xprt->tcp_recm);
xprt->tcp_more = (xprt->tcp_reclen & 0x80000000) ? 0 : 1;
@@ -843,14 +865,15 @@ tcp_input_record(struct rpc_xprt *xprt)
/* Read in a new fragment marker if necessary */
/* Can we ever really expect to get completely empty fragments? */
if ((result = tcp_read_fraghdr(xprt)) <= 0)
if ((result = tcp_read_fraghdr(xprt)) < 0)
return result;
avail = result;
/* Read in the xid if necessary */
if ((result = tcp_read_xid(xprt, avail)) <= 0)
if ((result = tcp_read_xid(xprt, avail)) < 0)
return result;
avail = result;
if (!(avail = result))
goto out_ok;
/* Find and lock the request corresponding to this xid */
req = xprt_lookup_rqst(xprt, xprt->tcp_xid);
@@ -868,9 +891,14 @@ tcp_input_record(struct rpc_xprt *xprt)
if ((result = tcp_read_discard(xprt, avail)) < 0)
return result;
out_ok:
dprintk("RPC: tcp_input_record done (off %d reclen %d copied %d)\n",
xprt->tcp_offset, xprt->tcp_reclen, xprt->tcp_copied);
result = xprt->tcp_reclen;
xprt->tcp_reclen = 0;
xprt->tcp_offset = 0;
if (!xprt->tcp_more)
xprt->tcp_copied = 0;
return result;
}
@@ -885,11 +913,19 @@ void tcp_rpciod_queue(void)
rpciod_wake_up();
}
int xprt_tcp_pending(void)
{
int retval;
spin_lock_bh(&rpc_queue_lock);
retval = !list_empty(&rpc_xprt_pending);
spin_unlock_bh(&rpc_queue_lock);
return retval;
}
static inline
void xprt_append_pending(struct rpc_xprt *xprt)
{
if (!list_empty(&xprt->rx_pending))
return;
spin_lock_bh(&rpc_queue_lock);
if (list_empty(&xprt->rx_pending)) {
list_add(&xprt->rx_pending, rpc_xprt_pending.prev);
@@ -1003,11 +1039,10 @@ tcp_state_change(struct sock *sk)
case TCP_ESTABLISHED:
if (xprt_test_and_set_connected(xprt))
break;
spin_lock_bh(&xprt_sock_lock);
spin_lock(&xprt->sock_lock);
if (xprt->snd_task && xprt->snd_task->tk_rpcwait == &xprt->sending)
rpc_wake_up_task(xprt->snd_task);
rpc_wake_up(&xprt->reconn);
spin_unlock_bh(&xprt_sock_lock);
spin_unlock(&xprt->sock_lock);
break;
case TCP_SYN_SENT:
case TCP_SYN_RECV:
@@ -1041,10 +1076,10 @@ tcp_write_space(struct sock *sk)
return;
if (!xprt_test_and_set_wspace(xprt)) {
spin_lock_bh(&xprt_sock_lock);
spin_lock(&xprt->sock_lock);
if (xprt->snd_task && xprt->snd_task->tk_rpcwait == &xprt->sending)
rpc_wake_up_task(xprt->snd_task);
spin_unlock_bh(&xprt_sock_lock);
spin_unlock(&xprt->sock_lock);
}
if (test_bit(SOCK_NOSPACE, &sock->flags)) {
@@ -1071,10 +1106,10 @@ udp_write_space(struct sock *sk)
return;
if (!xprt_test_and_set_wspace(xprt)) {
spin_lock_bh(&xprt_sock_lock);
spin_lock(&xprt->sock_lock);
if (xprt->snd_task && xprt->snd_task->tk_rpcwait == &xprt->sending)
rpc_wake_up_task(xprt->snd_task);
spin_unlock_bh(&xprt_sock_lock);
spin_unlock(&xprt->sock_lock);
}
if (sk->sleep && waitqueue_active(sk->sleep))
@@ -1100,55 +1135,6 @@ xprt_timer(struct rpc_task *task)
rpc_wake_up_task(task);
}
/*
* Serialize access to sockets, in order to prevent different
* requests from interfering with each other.
*/
static int
xprt_down_transmit(struct rpc_task *task)
{
struct rpc_xprt *xprt = task->tk_rqstp->rq_xprt;
struct rpc_rqst *req = task->tk_rqstp;
spin_lock_bh(&xprt_sock_lock);
spin_lock(&xprt_lock);
if (xprt->snd_task && xprt->snd_task != task) {
dprintk("RPC: %4d TCP write queue full (task %d)\n",
task->tk_pid, xprt->snd_task->tk_pid);
task->tk_timeout = 0;
task->tk_status = -EAGAIN;
rpc_sleep_on(&xprt->sending, task, NULL, NULL);
} else if (!xprt->snd_task) {
xprt->snd_task = task;
#ifdef RPC_PROFILE
req->rq_xtime = jiffies;
#endif
req->rq_bytes_sent = 0;
}
spin_unlock(&xprt_lock);
spin_unlock_bh(&xprt_sock_lock);
return xprt->snd_task == task;
}
/*
* Releases the socket for use by other requests.
*/
static inline void
xprt_up_transmit(struct rpc_task *task)
{
struct rpc_xprt *xprt = task->tk_rqstp->rq_xprt;
if (xprt->snd_task && xprt->snd_task == task) {
spin_lock_bh(&xprt_sock_lock);
spin_lock(&xprt_lock);
xprt->snd_task = NULL;
rpc_wake_up_next(&xprt->sending);
spin_unlock(&xprt_lock);
spin_unlock_bh(&xprt_sock_lock);
}
}
/*
* Place the actual RPC call.
* We have to copy the iovec because sendmsg fiddles with its contents.
@@ -1182,9 +1168,12 @@ xprt_transmit(struct rpc_task *task)
*marker = htonl(0x80000000|(req->rq_slen-sizeof(*marker)));
}
if (!xprt_down_transmit(task))
if (!xprt_lock_write(xprt, task))
return;
#ifdef RPC_PROFILE
req->rq_xtime = jiffies;
#endif
do_xprt_transmit(task);
}
@@ -1252,12 +1241,12 @@ do_xprt_transmit(struct rpc_task *task)
 	switch (status) {
 	case -ENOMEM:
 		/* Protect against (udp|tcp)_write_space */
-		spin_lock_bh(&xprt_sock_lock);
+		spin_lock_bh(&xprt->sock_lock);
 		if (!xprt_wspace(xprt)) {
 			task->tk_timeout = req->rq_timeout.to_current;
 			rpc_sleep_on(&xprt->sending, task, NULL, NULL);
 		}
-		spin_unlock_bh(&xprt_sock_lock);
+		spin_unlock_bh(&xprt->sock_lock);
 		return;
 	case -EAGAIN:
 		/* Keep holding the socket if it is blocked */
@@ -1268,6 +1257,9 @@ do_xprt_transmit(struct rpc_task *task)
 		if (!xprt->stream)
 			return;
 	default:
+		if (xprt->stream)
+			xprt_disconnect(xprt);
 		req->rq_bytes_sent = 0;
 		goto out_release;
 	}
@@ -1278,7 +1270,7 @@ do_xprt_transmit(struct rpc_task *task)
 	rpc_add_timer(task, xprt_timer);
 	rpc_unlock_task(task);
 
  out_release:
-	xprt_up_transmit(task);
+	xprt_release_write(xprt, task);
 }
/*
@@ -1313,7 +1305,7 @@ xprt_reserve(struct rpc_task *task)
 	dprintk("RPC: %4d xprt_reserve cong = %ld cwnd = %ld\n",
 			task->tk_pid, xprt->cong, xprt->cwnd);
-	spin_lock_bh(&xprt_sock_lock);
+	spin_lock_bh(&xprt->xprt_lock);
 	xprt_reserve_status(task);
 	if (task->tk_rqstp) {
 		task->tk_timeout = 0;
@@ -1324,7 +1316,7 @@ xprt_reserve(struct rpc_task *task)
 		task->tk_status = -EAGAIN;
 		rpc_sleep_on(&xprt->backlog, task, NULL, NULL);
 	}
-	spin_unlock_bh(&xprt_sock_lock);
+	spin_unlock_bh(&xprt->xprt_lock);
 	dprintk("RPC: %4d xprt_reserve returns %d\n",
 			task->tk_pid, task->tk_status);
 	return task->tk_status;
@@ -1397,7 +1389,11 @@ xprt_release(struct rpc_task *task)
 	struct rpc_xprt	*xprt = task->tk_xprt;
 	struct rpc_rqst	*req;
 
-	xprt_up_transmit(task);
+	if (xprt->snd_task == task) {
+		if (xprt->stream)
+			xprt_disconnect(xprt);
+		xprt_release_write(xprt, task);
+	}
 	if (!(req = task->tk_rqstp))
 		return;
 	task->tk_rqstp = NULL;
@@ -1411,7 +1407,7 @@ xprt_release(struct rpc_task *task)
 		rpc_remove_wait_queue(task);
 	}
 
-	spin_lock_bh(&xprt_sock_lock);
+	spin_lock_bh(&xprt->xprt_lock);
 	req->rq_next = xprt->free;
 	xprt->free = req;
@@ -1419,7 +1415,7 @@ xprt_release(struct rpc_task *task)
 	xprt->cong -= RPC_CWNDSCALE;
 	xprt_clear_backlog(xprt);
-	spin_unlock_bh(&xprt_sock_lock);
+	spin_unlock_bh(&xprt->xprt_lock);
 }
/*
@@ -1476,6 +1472,8 @@ xprt_setup(struct socket *sock, int proto,
 	} else
 		xprt->cwnd = RPC_INITCWND;
 	xprt->congtime = jiffies;
+	spin_lock_init(&xprt->sock_lock);
+	spin_lock_init(&xprt->xprt_lock);
 	init_waitqueue_head(&xprt->cong_wait);
 
 	/* Set timeout parameters */
@@ -1489,7 +1487,6 @@ xprt_setup(struct socket *sock, int proto,
 	xprt->pending = RPC_INIT_WAITQ("xprt_pending");
 	xprt->sending = RPC_INIT_WAITQ("xprt_sending");
 	xprt->backlog = RPC_INIT_WAITQ("xprt_backlog");
-	xprt->reconn = RPC_INIT_WAITQ("xprt_reconn");
 
 	/* initialize free list */
 	for (i = 0, req = xprt->slot; i < RPC_MAXREQS-1; i++, req++)
@@ -1625,7 +1622,6 @@ xprt_shutdown(struct rpc_xprt *xprt)
 	rpc_wake_up(&xprt->sending);
 	rpc_wake_up(&xprt->pending);
 	rpc_wake_up(&xprt->backlog);
-	rpc_wake_up(&xprt->reconn);
 	if (waitqueue_active(&xprt->cong_wait))
 		wake_up(&xprt->cong_wait);
 }