Commit a6385625 authored by Linus Torvalds's avatar Linus Torvalds

Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux

* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
  nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
  nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
  nfsd41: Documentation/filesystems/nfs41-server.txt
  nfsd41: CREATE_EXCLUSIVE4_1
  nfsd41: SUPPATTR_EXCLCREAT attribute
  nfsd41: support for 3-word long attribute bitmask
  nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
  nfsd41: pass writable attrs mask to nfsd4_decode_fattr
  nfsd41: provide support for minor version 1 at rpc level
  nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
  nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
  nfsd41: access_valid
  nfsd41: clientid handling
  nfsd41: check encode size for sessions maxresponse cached
  nfsd41: stateid handling
  nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
  nfsd41: destroy_session operation
  nfsd41: non-page DRC for solo sequence responses
  nfsd41: Add a create session replay cache
  nfsd41: create_session operation
  ...
parents b24241a0 04826f43
Kernel NFS Server Statistics
============================
This document describes the format and semantics of the statistics
which the kernel NFS server makes available to userspace. These
statistics are available in several text form pseudo files, each of
which is described separately below.
In most cases you don't need to know these formats, as the nfsstat(8)
program from the nfs-utils distribution provides a helpful command-line
interface for extracting and printing them.
All the files described here are formatted as a sequence of text lines,
separated by newline '\n' characters. Lines beginning with a hash
'#' character are comments intended for humans and should be ignored
by parsing routines. All other lines contain a sequence of fields
separated by whitespace.
/proc/fs/nfsd/pool_stats
------------------------
This file is available in kernels from 2.6.30 onwards, if the
/proc/fs/nfsd filesystem is mounted (it almost always should be).
The first line is a comment which describes the fields present in
all the other lines. The other lines present the following data as
a sequence of unsigned decimal numeric fields. One line is shown
for each NFS thread pool.
All counters are 64 bits wide and wrap naturally. There is no way
to zero these counters, instead applications should do their own
rate conversion.
pool
The id number of the NFS thread pool to which this line applies.
This number does not change.
Thread pool ids are a contiguous set of small integers starting
at zero. The maximum value depends on the thread pool mode, but
currently cannot be larger than the number of CPUs in the system.
Note that in the default case there will be a single thread pool
which contains all the nfsd threads and all the CPUs in the system,
and thus this file will have a single line with a pool id of "0".
packets-arrived
Counts how many NFS packets have arrived. More precisely, this
is the number of times that the network stack has notified the
sunrpc server layer that new data may be available on a transport
(e.g. an NFS or UDP socket or an NFS/RDMA endpoint).
Depending on the NFS workload patterns and various network stack
effects (such as Large Receive Offload) which can combine packets
on the wire, this may be either more or less than the number
of NFS calls received (which statistic is available elsewhere).
However this is a more accurate and less workload-dependent measure
of how much CPU load is being placed on the sunrpc server layer
due to NFS network traffic.
sockets-enqueued
Counts how many times an NFS transport is enqueued to wait for
an nfsd thread to service it, i.e. no nfsd thread was considered
available.
The circumstance this statistic tracks indicates that there was NFS
network-facing work to be done but it couldn't be done immediately,
thus introducing a small delay in servicing NFS calls. The ideal
rate of change for this counter is zero; significantly non-zero
values may indicate a performance limitation.
This can happen either because there are too few nfsd threads in the
thread pool for the NFS workload (the workload is thread-limited),
or because the NFS workload needs more CPU time than is available in
the thread pool (the workload is CPU-limited). In the former case,
configuring more nfsd threads will probably improve the performance
of the NFS workload. In the latter case, the sunrpc server layer is
already choosing not to wake idle nfsd threads because there are too
many nfsd threads which want to run but cannot, so configuring more
nfsd threads will make no difference whatsoever. The overloads-avoided
statistic (see below) can be used to distinguish these cases.
threads-woken
Counts how many times an idle nfsd thread is woken to try to
receive some data from an NFS transport.
This statistic tracks the circumstance where incoming
network-facing NFS work is being handled quickly, which is a good
thing. The ideal rate of change for this counter will be close
to but less than the rate of change of the packets-arrived counter.
overloads-avoided
Counts how many times the sunrpc server layer chose not to wake an
nfsd thread, despite the presence of idle nfsd threads, because
too many nfsd threads had been recently woken but could not get
enough CPU time to actually run.
This statistic counts a circumstance where the sunrpc layer
heuristically avoids overloading the CPU scheduler with too many
runnable nfsd threads. The ideal rate of change for this counter
is zero. Significant non-zero values indicate that the workload
is CPU limited. Usually this is associated with heavy CPU usage
on all the CPUs in the nfsd thread pool.
If a sustained large overloads-avoided rate is detected on a pool,
the top(1) utility should be used to check for the following
pattern of CPU usage on all the CPUs associated with the given
nfsd thread pool.
- %us ~= 0 (as you're *NOT* running applications on your NFS server)
- %wa ~= 0
- %id ~= 0
- %sy + %hi + %si ~= 100
If this pattern is seen, configuring more nfsd threads will *not*
improve the performance of the workload. If this patten is not
seen, then something more subtle is wrong.
threads-timedout
Counts how many times an nfsd thread triggered an idle timeout,
i.e. was not woken to handle any incoming network packets for
some time.
This statistic counts a circumstance where there are more nfsd
threads configured than can be used by the NFS workload. This is
a clue that the number of nfsd threads can be reduced without
affecting performance. Unfortunately, it's only a clue and not
a strong indication, for a couple of reasons:
- Currently the rate at which the counter is incremented is quite
slow; the idle timeout is 60 minutes. Unless the NFS workload
remains constant for hours at a time, this counter is unlikely
to be providing information that is still useful.
- It is usually a wise policy to provide some slack,
i.e. configure a few more nfsds than are currently needed,
to allow for future spikes in load.
Note that incoming packets on NFS transports will be dealt with in
one of three ways. An nfsd thread can be woken (threads-woken counts
this case), or the transport can be enqueued for later attention
(sockets-enqueued counts this case), or the packet can be temporarily
deferred because the transport is currently being used by an nfsd
thread. This last case is not very interesting and is not explicitly
counted, but can be inferred from the other counters thus:
packets-deferred = packets-arrived - ( sockets-enqueued + threads-woken )
More
----
Descriptions of the other statistics file should go here.
Greg Banks <gnb@sgi.com>
26 Mar 2009
NFSv4.1 Server Implementation
Server support for minorversion 1 can be controlled using the
/proc/fs/nfsd/versions control file. The string output returned
by reading this file will contain either "+4.1" or "-4.1"
correspondingly.
Currently, server support for minorversion 1 is disabled by default.
It can be enabled at run time by writing the string "+4.1" to
the /proc/fs/nfsd/versions control file. Note that to write this
control file, the nfsd service must be taken down. Use your user-mode
nfs-utils to set this up; see rpc.nfsd(8)
The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based
on the latest NFSv4.1 Internet Draft:
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29
From the many new features in NFSv4.1 the current implementation
focuses on the mandatory-to-implement NFSv4.1 Sessions, providing
"exactly once" semantics and better control and throttling of the
resources allocated for each client.
Other NFSv4.1 features, Parallel NFS operations in particular,
are still under development out of tree.
See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design
for more information.
The table below, taken from the NFSv4.1 document, lists
the operations that are mandatory to implement (REQ), optional
(OPT), and NFSv4.0 operations that are required not to implement (MNI)
in minor version 1. The first column indicates the operations that
are not supported yet by the linux server implementation.
The OPTIONAL features identified and their abbreviations are as follows:
pNFS Parallel NFS
FDELG File Delegations
DDELG Directory Delegations
The following abbreviations indicate the linux server implementation status.
I Implemented NFSv4.1 operations.
NS Not Supported.
NS* unimplemented optional feature.
P pNFS features implemented out of tree.
PNS pNFS features that are not supported yet (out of tree).
Operations
+----------------------+------------+--------------+----------------+
| Operation | REQ, REC, | Feature | Definition |
| | OPT, or | (REQ, REC, | |
| | MNI | or OPT) | |
+----------------------+------------+--------------+----------------+
| ACCESS | REQ | | Section 18.1 |
NS | BACKCHANNEL_CTL | REQ | | Section 18.33 |
NS | BIND_CONN_TO_SESSION | REQ | | Section 18.34 |
| CLOSE | REQ | | Section 18.2 |
| COMMIT | REQ | | Section 18.3 |
| CREATE | REQ | | Section 18.4 |
I | CREATE_SESSION | REQ | | Section 18.36 |
NS*| DELEGPURGE | OPT | FDELG (REQ) | Section 18.5 |
| DELEGRETURN | OPT | FDELG, | Section 18.6 |
| | | DDELG, pNFS | |
| | | (REQ) | |
NS | DESTROY_CLIENTID | REQ | | Section 18.50 |
I | DESTROY_SESSION | REQ | | Section 18.37 |
I | EXCHANGE_ID | REQ | | Section 18.35 |
NS | FREE_STATEID | REQ | | Section 18.38 |
| GETATTR | REQ | | Section 18.7 |
P | GETDEVICEINFO | OPT | pNFS (REQ) | Section 18.40 |
P | GETDEVICELIST | OPT | pNFS (OPT) | Section 18.41 |
| GETFH | REQ | | Section 18.8 |
NS*| GET_DIR_DELEGATION | OPT | DDELG (REQ) | Section 18.39 |
P | LAYOUTCOMMIT | OPT | pNFS (REQ) | Section 18.42 |
P | LAYOUTGET | OPT | pNFS (REQ) | Section 18.43 |
P | LAYOUTRETURN | OPT | pNFS (REQ) | Section 18.44 |
| LINK | OPT | | Section 18.9 |
| LOCK | REQ | | Section 18.10 |
| LOCKT | REQ | | Section 18.11 |
| LOCKU | REQ | | Section 18.12 |
| LOOKUP | REQ | | Section 18.13 |
| LOOKUPP | REQ | | Section 18.14 |
| NVERIFY | REQ | | Section 18.15 |
| OPEN | REQ | | Section 18.16 |
NS*| OPENATTR | OPT | | Section 18.17 |
| OPEN_CONFIRM | MNI | | N/A |
| OPEN_DOWNGRADE | REQ | | Section 18.18 |
| PUTFH | REQ | | Section 18.19 |
| PUTPUBFH | REQ | | Section 18.20 |
| PUTROOTFH | REQ | | Section 18.21 |
| READ | REQ | | Section 18.22 |
| READDIR | REQ | | Section 18.23 |
| READLINK | OPT | | Section 18.24 |
NS | RECLAIM_COMPLETE | REQ | | Section 18.51 |
| RELEASE_LOCKOWNER | MNI | | N/A |
| REMOVE | REQ | | Section 18.25 |
| RENAME | REQ | | Section 18.26 |
| RENEW | MNI | | N/A |
| RESTOREFH | REQ | | Section 18.27 |
| SAVEFH | REQ | | Section 18.28 |
| SECINFO | REQ | | Section 18.29 |
NS | SECINFO_NO_NAME | REC | pNFS files | Section 18.45, |
| | | layout (REQ) | Section 13.12 |
I | SEQUENCE | REQ | | Section 18.46 |
| SETATTR | REQ | | Section 18.30 |
| SETCLIENTID | MNI | | N/A |
| SETCLIENTID_CONFIRM | MNI | | N/A |
NS | SET_SSV | REQ | | Section 18.47 |
NS | TEST_STATEID | REQ | | Section 18.48 |
| VERIFY | REQ | | Section 18.31 |
NS*| WANT_DELEGATION | OPT | FDELG (OPT) | Section 18.49 |
| WRITE | REQ | | Section 18.32 |
Callback Operations
+-------------------------+-----------+-------------+---------------+
| Operation | REQ, REC, | Feature | Definition |
| | OPT, or | (REQ, REC, | |
| | MNI | or OPT) | |
+-------------------------+-----------+-------------+---------------+
| CB_GETATTR | OPT | FDELG (REQ) | Section 20.1 |
P | CB_LAYOUTRECALL | OPT | pNFS (REQ) | Section 20.3 |
NS*| CB_NOTIFY | OPT | DDELG (REQ) | Section 20.4 |
P | CB_NOTIFY_DEVICEID | OPT | pNFS (OPT) | Section 20.12 |
NS*| CB_NOTIFY_LOCK | OPT | | Section 20.11 |
NS*| CB_PUSH_DELEG | OPT | FDELG (OPT) | Section 20.5 |
| CB_RECALL | OPT | FDELG, | Section 20.2 |
| | | DDELG, pNFS | |
| | | (REQ) | |
NS*| CB_RECALL_ANY | OPT | FDELG, | Section 20.6 |
| | | DDELG, pNFS | |
| | | (REQ) | |
NS | CB_RECALL_SLOT | REQ | | Section 20.8 |
NS*| CB_RECALLABLE_OBJ_AVAIL | OPT | DDELG, pNFS | Section 20.7 |
| | | (REQ) | |
I | CB_SEQUENCE | OPT | FDELG, | Section 20.9 |
| | | DDELG, pNFS | |
| | | (REQ) | |
NS*| CB_WANTS_CANCELLED | OPT | FDELG, | Section 20.10 |
| | | DDELG, pNFS | |
| | | (REQ) | |
+-------------------------+-----------+-------------+---------------+
Implementation notes:
EXCHANGE_ID:
* only SP4_NONE state protection supported
* implementation ids are ignored
CREATE_SESSION:
* backchannel attributes are ignored
* backchannel security parameters are ignored
SEQUENCE:
* no support for dynamic slot table renegotiation (optional)
nfsv4.1 COMPOUND rules:
The following cases aren't supported yet:
* Enforcing of NFS4ERR_NOT_ONLY_OP for: BIND_CONN_TO_SESSION, CREATE_SESSION,
DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID.
* DESTROY_SESSION MUST be the final operation in the COMPOUND request.
......@@ -426,8 +426,15 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
ret = nlm_granted;
goto out;
case -EAGAIN:
ret = nlm_lck_denied;
/*
* If this is a blocking request for an
* already pending lock request then we need
* to put it back on lockd's block list
*/
if (wait)
break;
ret = nlm_lck_denied;
goto out;
case FILE_LOCK_DEFERRED:
if (wait)
break;
......@@ -443,10 +450,6 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
goto out;
}
ret = nlm_lck_denied;
if (!wait)
goto out;
ret = nlm_lck_blocked;
/* Append to list of blocked */
......
config NFSD
tristate "NFS server support"
depends on INET
depends on FILE_LOCKING
select LOCKD
select SUNRPC
select EXPORTFS
......
......@@ -18,6 +18,7 @@
#include <linux/unistd.h>
#include <linux/slab.h>
#include <linux/major.h>
#include <linux/magic.h>
#include <linux/sunrpc/svc.h>
#include <linux/nfsd/nfsd.h>
......@@ -202,6 +203,7 @@ nfsd3_proc_write(struct svc_rqst *rqstp, struct nfsd3_writeargs *argp,
struct nfsd3_writeres *resp)
{
__be32 nfserr;
unsigned long cnt = argp->len;
dprintk("nfsd: WRITE(3) %s %d bytes at %ld%s\n",
SVCFH_fmt(&argp->fh),
......@@ -214,9 +216,9 @@ nfsd3_proc_write(struct svc_rqst *rqstp, struct nfsd3_writeargs *argp,
nfserr = nfsd_write(rqstp, &resp->fh, NULL,
argp->offset,
rqstp->rq_vec, argp->vlen,
argp->len,
&cnt,
&resp->committed);
resp->count = argp->count;
resp->count = cnt;
RETURN_STATUS(nfserr);
}
......@@ -569,7 +571,7 @@ nfsd3_proc_fsinfo(struct svc_rqst * rqstp, struct nfsd_fhandle *argp,
struct super_block *sb = argp->fh.fh_dentry->d_inode->i_sb;
/* Note that we don't care for remote fs's here */
if (sb->s_magic == 0x4d44 /* MSDOS_SUPER_MAGIC */) {
if (sb->s_magic == MSDOS_SUPER_MAGIC) {
resp->f_properties = NFS3_FSF_BILLYBOY;
}
resp->f_maxfilesize = sb->s_maxbytes;
......@@ -610,7 +612,7 @@ nfsd3_proc_pathconf(struct svc_rqst * rqstp, struct nfsd_fhandle *argp,
resp->p_link_max = EXT2_LINK_MAX;
resp->p_name_max = EXT2_NAME_LEN;
break;
case 0x4d44: /* MSDOS_SUPER_MAGIC */
case MSDOS_SUPER_MAGIC:
resp->p_case_insensitive = 1;
resp->p_case_preserving = 0;
break;
......
......@@ -218,7 +218,7 @@ static int
encode_cb_recall(struct xdr_stream *xdr, struct nfs4_cb_recall *cb_rec)
{
__be32 *p;
int len = cb_rec->cbr_fhlen;
int len = cb_rec->cbr_fh.fh_size;
RESERVE_SPACE(12+sizeof(cb_rec->cbr_stateid) + len);
WRITE32(OP_CB_RECALL);
......@@ -226,7 +226,7 @@ encode_cb_recall(struct xdr_stream *xdr, struct nfs4_cb_recall *cb_rec)
WRITEMEM(&cb_rec->cbr_stateid.si_opaque, sizeof(stateid_opaque_t));
WRITE32(cb_rec->cbr_trunc);
WRITE32(len);
WRITEMEM(cb_rec->cbr_fhval, len);
WRITEMEM(&cb_rec->cbr_fh.fh_base, len);
return 0;
}
......@@ -361,9 +361,8 @@ static struct rpc_program cb_program = {
/* Reference counting, callback cleanup, etc., all look racy as heck.
* And why is cb_set an atomic? */
static int do_probe_callback(void *data)
static struct rpc_clnt *setup_callback_client(struct nfs4_client *clp)
{
struct nfs4_client *clp = data;
struct sockaddr_in addr;
struct nfs4_callback *cb = &clp->cl_callback;
struct rpc_timeout timeparms = {
......@@ -384,17 +383,10 @@ static int do_probe_callback(void *data)
.flags = (RPC_CLNT_CREATE_NOPING | RPC_CLNT_CREATE_QUIET),
.client_name = clp->cl_principal,
};
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
.rpc_argp = clp,
};
struct rpc_clnt *client;
int status;
if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5)) {
status = nfserr_cb_path_down;
goto out_err;
}
if (!clp->cl_principal && (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
return ERR_PTR(-EINVAL);
/* Initialize address */
memset(&addr, 0, sizeof(addr));
......@@ -404,9 +396,29 @@ static int do_probe_callback(void *data)
/* Create RPC client */
client = rpc_create(&args);
if (IS_ERR(client))
dprintk("NFSD: couldn't create callback client: %ld\n",
PTR_ERR(client));
return client;
}
static int do_probe_callback(void *data)
{
struct nfs4_client *clp = data;
struct nfs4_callback *cb = &clp->cl_callback;
struct rpc_message msg = {
.rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL],
.rpc_argp = clp,
};
struct rpc_clnt *client;
int status;
client = setup_callback_client(clp);
if (IS_ERR(client)) {
dprintk("NFSD: couldn't create callback client\n");
status = PTR_ERR(client);
dprintk("NFSD: couldn't create callback client: %d\n",
status);
goto out_err;
}
......@@ -422,10 +434,10 @@ static int do_probe_callback(void *data)
out_release_client:
rpc_shutdown_client(client);
out_err:
dprintk("NFSD: warning: no callback path to client %.*s\n",
(int)clp->cl_name.len, clp->cl_name.data);
dprintk("NFSD: warning: no callback path to client %.*s: error %d\n",
(int)clp->cl_name.len, clp->cl_name.data, status);
put_nfs4_client(clp);
return status;
return 0;
}
/*
......@@ -451,7 +463,6 @@ nfsd4_probe_callback(struct nfs4_client *clp)
/*
* called with dp->dl_count inc'ed.
* nfs4_lock_state() may or may not have been called.
*/
void
nfsd4_cb_recall(struct nfs4_delegation *dp)
......
This diff is collapsed.
......@@ -182,36 +182,26 @@ nfsd4_create_clid_dir(struct nfs4_client *clp)
typedef int (recdir_func)(struct dentry *, struct dentry *);
struct dentry_list {
struct dentry *dentry;
struct name_list {
char name[HEXDIR_LEN];
struct list_head list;
};
struct dentry_list_arg {
struct list_head dentries;
struct dentry *parent;
};
static int
nfsd4_build_dentrylist(void *arg, const char *name, int namlen,
nfsd4_build_namelist(void *arg, const char *name, int namlen,
loff_t offset, u64 ino, unsigned int d_type)
{
struct dentry_list_arg *dla = arg;
struct list_head *dentries = &dla->dentries;
struct dentry *parent = dla->parent;
struct dentry *dentry;
struct dentry_list *child;
struct list_head *names = arg;
struct name_list *entry;
if (name && isdotent(name, namlen))
if (namlen != HEXDIR_LEN - 1)
return 0;
dentry = lookup_one_len(name, parent, namlen);
if (IS_ERR(dentry))
return PTR_ERR(dentry);
child = kmalloc(sizeof(*child), GFP_KERNEL);
if (child == NULL)
entry = kmalloc(sizeof(struct name_list), GFP_KERNEL);
if (entry == NULL)
return -ENOMEM;
child->dentry = dentry;
list_add(&child->list, dentries);
memcpy(entry->name, name, HEXDIR_LEN - 1);
entry->name[HEXDIR_LEN - 1] = '\0';
list_add(&entry->list, names);
return 0;
}
......@@ -220,11 +210,9 @@ nfsd4_list_rec_dir(struct dentry *dir, recdir_func *f)
{
const struct cred *original_cred;
struct file *filp;
struct dentry_list_arg dla = {
.parent = dir,
};
struct list_head *dentries = &dla.dentries;
struct dentry_list *child;
LIST_HEAD(names);
struct name_list *entry;
struct dentry *dentry;
int status;
if (!rec_dir_init)
......@@ -233,31 +221,34 @@ nfsd4_list_rec_dir(struct dentry *dir, recdir_func *f)
status = nfs4_save_creds(&original_cred);
if (status < 0)
return status;
INIT_LIST_HEAD(dentries);
filp = dentry_open(dget(dir), mntget(rec_dir.mnt), O_RDONLY,
current_cred());
status = PTR_ERR(filp);
if (IS_ERR(filp))
goto out;
INIT_LIST_HEAD(dentries);
status = vfs_readdir(filp, nfsd4_build_dentrylist, &dla);
status = vfs_readdir(filp, nfsd4_build_namelist, &names);
fput(filp);
while (!list_empty(dentries)) {
child = list_entry(dentries->next, struct dentry_list, list);
status = f(dir, child->dentry);
while (!list_empty(&names)) {
entry = list_entry(names.next, struct name_list, list);
dentry = lookup_one_len(entry->name, dir, HEXDIR_LEN-1);
if (IS_ERR(dentry)) {
status = PTR_ERR(dentry);
goto out;
}
status = f(dir, dentry);
dput(dentry);
if (status)
goto out;
list_del(&child->list);
dput(child->dentry);
kfree(child);
list_del(&entry->list);
kfree(entry);
}
out:
while (!list_empty(dentries)) {
child = list_entry(dentries->next, struct dentry_list, list);
list_del(&child->list);
dput(child->dentry);
kfree(child);
while (!list_empty(&names)) {
entry = list_entry(names.next, struct name_list, list);
list_del(&entry->list);
kfree(entry);
}
nfs4_reset_creds(original_cred);
return status;
......@@ -353,7 +344,8 @@ purge_old(struct dentry *parent, struct dentry *child)
{
int status;
if (nfs4_has_reclaimed_state(child->d_name.name))
/* note: we currently use this path only for minorversion 0 */
if (nfs4_has_reclaimed_state(child->d_name.name, false))
return 0;
status = nfsd4_clear_clid_dir(parent, child);
......
This diff is collapsed.
This diff is collapsed.
......@@ -60,6 +60,7 @@ enum {
NFSD_FO_UnlockFS,
NFSD_Threads,
NFSD_Pool_Threads,
NFSD_Pool_Stats,
NFSD_Versions,
NFSD_Ports,
NFSD_MaxBlkSize,
......@@ -172,6 +173,16 @@ static const struct file_operations exports_operations = {
.owner = THIS_MODULE,
};
extern int nfsd_pool_stats_open(struct inode *inode, struct file *file);
static struct file_operations pool_stats_operations = {
.open = nfsd_pool_stats_open,
.read = seq_read,
.llseek = seq_lseek,
.release = seq_release,
.owner = THIS_MODULE,
};
/*----------------------------------------------------------------------------*/
/*
* payload - write methods
......@@ -781,8 +792,9 @@ static ssize_t write_pool_threads(struct file *file, char *buf, size_t size)
static ssize_t __write_versions(struct file *file, char *buf, size_t size)
{
char *mesg = buf;
char *vers, sign;
char *vers, *minorp, sign;
int len, num;
unsigned minor;
ssize_t tlen = 0;
char *sep;
......@@ -803,9 +815,20 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
do {
sign = *vers;
if (sign == '+' || sign == '-')
num = simple_strtol((vers+1), NULL, 0);
num = simple_strtol((vers+1), &minorp, 0);
else
num = simple_strtol(vers, NULL, 0);
num = simple_strtol(vers, &minorp, 0);
if (*minorp == '.') {
if (num < 4)
return -EINVAL;
minor = simple_strtoul(minorp+1, NULL, 0);
if (minor == 0)
return -EINVAL;
if (nfsd_minorversion(minor, sign == '-' ?
NFSD_CLEAR : NFSD_SET) < 0)
return -EINVAL;
goto next;
}
switch(num) {
case 2:
case 3:
......@@ -815,6 +838,7 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
default:
return -EINVAL;
}
next:
vers += len + 1;
tlen += len;
} while ((len = qword_get(&mesg, vers, size)) > 0);
......@@ -833,6 +857,13 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
num);
sep = " ";
}
if (nfsd_vers(4, NFSD_AVAIL))
for (minor = 1; minor <= NFSD_SUPPORTED_MINOR_VERSION; minor++)
len += sprintf(buf+len, " %c4.%u",
(nfsd_vers(4, NFSD_TEST) &&
nfsd_minorversion(minor, NFSD_TEST)) ?
'+' : '-',
minor);
len += sprintf(buf+len, "\n");
return len;
}
......@@ -1248,6 +1279,7 @@ static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Stats] = {"pool_stats", &pool_stats_operations, S_IRUGO},
[NFSD_Versions] = {"versions", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Ports] = {"portlist", &transaction_ops, S_IWUSR|S_IRUGO},
[NFSD_MaxBlkSize] = {"max_block_size", &transaction_ops, S_IWUSR|S_IRUGO},
......
......@@ -180,6 +180,7 @@ nfsd_proc_write(struct svc_rqst *rqstp, struct nfsd_writeargs *argp,
{
__be32 nfserr;
int stable = 1;
unsigned long cnt = argp->len;
dprintk("nfsd: WRITE %s %d bytes at %d\n",
SVCFH_fmt(&argp->fh),
......@@ -188,7 +189,7 @@ nfsd_proc_write(struct svc_rqst *rqstp, struct nfsd_writeargs *argp,
nfserr = nfsd_write(rqstp, fh_copy(&resp->fh, &argp->fh), NULL,
argp->offset,
rqstp->rq_vec, argp->vlen,
argp->len,
&cnt,
&stable);
return nfsd_return_attrs(nfserr, resp);
}
......
......@@ -22,6 +22,7 @@
#include <linux/freezer.h>
#include <linux/fs_struct.h>
#include <linux/kthread.h>
#include <linux/swap.h>
#include <linux/sunrpc/types.h>
#include <linux/sunrpc/stats.h>
......@@ -40,9 +41,6 @@
extern struct svc_program nfsd_program;
static int nfsd(void *vrqstp);
struct timeval nfssvc_boot;
static atomic_t nfsd_busy;
static unsigned long nfsd_last_call;
static DEFINE_SPINLOCK(nfsd_call_lock);
/*
* nfsd_mutex protects nfsd_serv -- both the pointer itself and the members
......@@ -123,6 +121,8 @@ struct svc_program nfsd_program = {
};
u32 nfsd_supported_minorversion;
int nfsd_vers(int vers, enum vers_op change)
{
if (vers < NFSD_MINVERS || vers >= NFSD_NRVERS)
......@@ -149,6 +149,28 @@ int nfsd_vers(int vers, enum vers_op change)
}
return 0;
}
int nfsd_minorversion(u32 minorversion, enum vers_op change)
{
if (minorversion > NFSD_SUPPORTED_MINOR_VERSION)
return -1;
switch(change) {
case NFSD_SET:
nfsd_supported_minorversion = minorversion;
break;
case NFSD_CLEAR:
if (minorversion == 0)
return -1;
nfsd_supported_minorversion = minorversion - 1;
break;
case NFSD_TEST:
return minorversion <= nfsd_supported_minorversion;
case NFSD_AVAIL:
return minorversion <= NFSD_SUPPORTED_MINOR_VERSION;
}
return 0;
}
/*
* Maximum number of nfsd processes
*/
......@@ -200,6 +222,28 @@ void nfsd_reset_versions(void)
}
}
/*
* Each session guarantees a negotiated per slot memory cache for replies
* which in turn consumes memory beyond the v2/v3/v4.0 server. A dedicated
* NFSv4.1 server might want to use more memory for a DRC than a machine
* with mutiple services.
*
* Impose a hard limit on the number of pages for the DRC which varies
* according to the machines free pages. This is of course only a default.
*
* For now this is a #defined shift which could be under admin control
* in the future.
*/
static void set_max_drc(void)
{
/* The percent of nr_free_buffer_pages used by the V4.1 server DRC */
#define NFSD_DRC_SIZE_SHIFT 7
nfsd_serv->sv_drc_max_pages = nr_free_buffer_pages()
>> NFSD_DRC_SIZE_SHIFT;
nfsd_serv->sv_drc_pages_used = 0;
dprintk("%s svc_drc_max_pages %u\n", __func__,
nfsd_serv->sv_drc_max_pages);
}
int nfsd_create_serv(void)
{
......@@ -227,11 +271,12 @@ int nfsd_create_serv(void)
nfsd_max_blksize /= 2;
}
atomic_set(&nfsd_busy, 0);
nfsd_serv = svc_create_pooled(&nfsd_program, nfsd_max_blksize,
nfsd_last_thread, nfsd, THIS_MODULE);
if (nfsd_serv == NULL)
err = -ENOMEM;
else
set_max_drc();
do_gettimeofday(&nfssvc_boot); /* record boot time */
return err;
......@@ -375,26 +420,6 @@ nfsd_svc(unsigned short port, int nrservs)
return error;
}
static inline void
update_thread_usage(int busy_threads)
{
unsigned long prev_call;
unsigned long diff;
int decile;
spin_lock(&nfsd_call_lock);
prev_call = nfsd_last_call;
nfsd_last_call = jiffies;
decile = busy_threads*10/nfsdstats.th_cnt;
if (decile>0 && decile <= 10) {
diff = nfsd_last_call - prev_call;
if ( (nfsdstats.th_usage[decile-1] += diff) >= NFSD_USAGE_WRAP)
nfsdstats.th_usage[decile-1] -= NFSD_USAGE_WRAP;
if (decile == 10)
nfsdstats.th_fullcnt++;
}
spin_unlock(&nfsd_call_lock);
}
/*
* This is the NFS server kernel thread
......@@ -460,8 +485,6 @@ nfsd(void *vrqstp)
continue;
}
update_thread_usage(atomic_read(&nfsd_busy));
atomic_inc(&nfsd_busy);
/* Lock the export hash tables for reading. */
exp_readlock();
......@@ -470,8 +493,6 @@ nfsd(void *vrqstp)
/* Unlock export hash tables */
exp_readunlock();
update_thread_usage(atomic_read(&nfsd_busy));
atomic_dec(&nfsd_busy);
}
/* Clear signals before calling svc_exit_thread() */
......@@ -539,6 +560,10 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
+ rqstp->rq_res.head[0].iov_len;
rqstp->rq_res.head[0].iov_len += sizeof(__be32);
/* NFSv4.1 DRC requires statp */
if (rqstp->rq_vers == 4)
nfsd4_set_statp(rqstp, statp);
/* Now call the procedure handler, and encode NFS status. */
nfserr = proc->pc_func(rqstp, rqstp->rq_argp, rqstp->rq_resp);
nfserr = map_new_errors(rqstp->rq_vers, nfserr);
......@@ -570,3 +595,10 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
nfsd_cache_update(rqstp, proc->pc_cachetype, statp + 1);
return 1;
}
int nfsd_pool_stats_open(struct inode *inode, struct file *file)
{
if (nfsd_serv == NULL)
return -ENODEV;
return svc_pool_stats_open(nfsd_serv, file);
}
......@@ -366,8 +366,9 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
}
/* Revoke setuid/setgid on chown */
if (((iap->ia_valid & ATTR_UID) && iap->ia_uid != inode->i_uid) ||
((iap->ia_valid & ATTR_GID) && iap->ia_gid != inode->i_gid)) {
if (!S_ISDIR(inode->i_mode) &&
(((iap->ia_valid & ATTR_UID) && iap->ia_uid != inode->i_uid) ||
((iap->ia_valid & ATTR_GID) && iap->ia_gid != inode->i_gid))) {
iap->ia_valid |= ATTR_KILL_PRIV;
if (iap->ia_valid & ATTR_MODE) {
/* we're setting mode too, just clear the s*id bits */
......@@ -960,7 +961,7 @@ static void kill_suid(struct dentry *dentry)
static __be32
nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
loff_t offset, struct kvec *vec, int vlen,
unsigned long cnt, int *stablep)
unsigned long *cnt, int *stablep)
{
struct svc_export *exp;
struct dentry *dentry;
......@@ -974,7 +975,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
err = nfserr_perm;
if ((fhp->fh_export->ex_flags & NFSEXP_MSNFS) &&
(!lock_may_write(file->f_path.dentry->d_inode, offset, cnt)))
(!lock_may_write(file->f_path.dentry->d_inode, offset, *cnt)))
goto out;
#endif
......@@ -1009,7 +1010,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
host_err = vfs_writev(file, (struct iovec __user *)vec, vlen, &offset);
set_fs(oldfs);
if (host_err >= 0) {
nfsdstats.io_write += cnt;
nfsdstats.io_write += host_err;
fsnotify_modify(file->f_path.dentry);
}
......@@ -1054,9 +1055,10 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
}
dprintk("nfsd: write complete host_err=%d\n", host_err);
if (host_err >= 0)
if (host_err >= 0) {
err = 0;
else
*cnt = host_err;
} else
err = nfserrno(host_err);
out:
return err;
......@@ -1098,7 +1100,7 @@ nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
*/
__be32
nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
loff_t offset, struct kvec *vec, int vlen, unsigned long cnt,
loff_t offset, struct kvec *vec, int vlen, unsigned long *cnt,
int *stablep)
{
__be32 err = 0;
......@@ -1179,6 +1181,21 @@ nfsd_create_setattr(struct svc_rqst *rqstp, struct svc_fh *resfhp,
return 0;
}
/* HPUX client sometimes creates a file in mode 000, and sets size to 0.
* setting size to 0 may fail for some specific file systems by the permission
* checking which requires WRITE permission but the mode is 000.
* we ignore the resizing(to 0) on the just new created file, since the size is
* 0 after file created.
*
* call this only after vfs_create() is called.
* */
static void
nfsd_check_ignore_resizing(struct iattr *iap)
{
if ((iap->ia_valid & ATTR_SIZE) && (iap->ia_size == 0))
iap->ia_valid &= ~ATTR_SIZE;
}
/*
* Create a file (regular, directory, device, fifo); UNIX sockets
* not yet implemented.
......@@ -1274,6 +1291,8 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
switch (type) {
case S_IFREG:
host_err = vfs_create(dirp, dchild, iap->ia_mode, NULL);
if (!host_err)
nfsd_check_ignore_resizing(iap);
break;
case S_IFDIR:
host_err = vfs_mkdir(dirp, dchild, iap->ia_mode);
......@@ -1427,6 +1446,8 @@ nfsd_create_v3(struct svc_rqst *rqstp, struct svc_fh *fhp,
/* setattr will sync the child (or not) */
}
nfsd_check_ignore_resizing(iap);
if (createmode == NFS3_CREATE_EXCLUSIVE) {
/* Cram the verifier into atime/mtime */
iap->ia_valid = ATTR_MTIME|ATTR_ATIME
......
......@@ -25,13 +25,13 @@ struct svc_rqst;
#define NLM_MAXCOOKIELEN 32
#define NLM_MAXSTRLEN 1024
#define nlm_granted __constant_htonl(NLM_LCK_GRANTED)
#define nlm_lck_denied __constant_htonl(NLM_LCK_DENIED)
#define nlm_lck_denied_nolocks __constant_htonl(NLM_LCK_DENIED_NOLOCKS)
#define nlm_lck_blocked __constant_htonl(NLM_LCK_BLOCKED)
#define nlm_lck_denied_grace_period __constant_htonl(NLM_LCK_DENIED_GRACE_PERIOD)
#define nlm_granted cpu_to_be32(NLM_LCK_GRANTED)
#define nlm_lck_denied cpu_to_be32(NLM_LCK_DENIED)
#define nlm_lck_denied_nolocks cpu_to_be32(NLM_LCK_DENIED_NOLOCKS)
#define nlm_lck_blocked cpu_to_be32(NLM_LCK_BLOCKED)
#define nlm_lck_denied_grace_period cpu_to_be32(NLM_LCK_DENIED_GRACE_PERIOD)
#define nlm_drop_reply __constant_htonl(30000)
#define nlm_drop_reply cpu_to_be32(30000)
/* Lock info passed via NLM */
struct nlm_lock {
......
......@@ -15,11 +15,11 @@
#include <linux/lockd/xdr.h>
/* error codes new to NLMv4 */
#define nlm4_deadlock __constant_htonl(NLM_DEADLCK)
#define nlm4_rofs __constant_htonl(NLM_ROFS)
#define nlm4_stale_fh __constant_htonl(NLM_STALE_FH)
#define nlm4_fbig __constant_htonl(NLM_FBIG)
#define nlm4_failed __constant_htonl(NLM_FAILED)
#define nlm4_deadlock cpu_to_be32(NLM_DEADLCK)
#define nlm4_rofs cpu_to_be32(NLM_ROFS)
#define nlm4_stale_fh cpu_to_be32(NLM_STALE_FH)
#define nlm4_fbig cpu_to_be32(NLM_FBIG)
#define nlm4_failed cpu_to_be32(NLM_FAILED)
......
......@@ -109,7 +109,6 @@
NFSERR_FILE_OPEN = 10046, /* v4 */
NFSERR_ADMIN_REVOKED = 10047, /* v4 */
NFSERR_CB_PATH_DOWN = 10048, /* v4 */
NFSERR_REPLAY_ME = 10049 /* v4 */
};
/* NFSv2 file types - beware, these are not the same in NFSv3 */
......
......@@ -21,6 +21,7 @@
#define NFS4_FHSIZE 128
#define NFS4_MAXPATHLEN PATH_MAX
#define NFS4_MAXNAMLEN NAME_MAX
#define NFS4_MAX_SESSIONID_LEN 16
#define NFS4_ACCESS_READ 0x0001
#define NFS4_ACCESS_LOOKUP 0x0002
......@@ -38,6 +39,7 @@
#define NFS4_OPEN_RESULT_CONFIRM 0x0002
#define NFS4_OPEN_RESULT_LOCKTYPE_POSIX 0x0004
#define NFS4_SHARE_ACCESS_MASK 0x000F
#define NFS4_SHARE_ACCESS_READ 0x0001
#define NFS4_SHARE_ACCESS_WRITE 0x0002
#define NFS4_SHARE_ACCESS_BOTH 0x0003
......@@ -45,6 +47,19 @@
#define NFS4_SHARE_DENY_WRITE 0x0002
#define NFS4_SHARE_DENY_BOTH 0x0003
/* nfs41 */
#define NFS4_SHARE_WANT_MASK 0xFF00
#define NFS4_SHARE_WANT_NO_PREFERENCE 0x0000
#define NFS4_SHARE_WANT_READ_DELEG 0x0100
#define NFS4_SHARE_WANT_WRITE_DELEG 0x0200
#define NFS4_SHARE_WANT_ANY_DELEG 0x0300
#define NFS4_SHARE_WANT_NO_DELEG 0x0400
#define NFS4_SHARE_WANT_CANCEL 0x0500
#define NFS4_SHARE_WHEN_MASK 0xF0000
#define NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL 0x10000
#define NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED 0x20000
#define NFS4_SET_TO_SERVER_TIME 0
#define NFS4_SET_TO_CLIENT_TIME 1
......@@ -88,6 +103,31 @@
#define NFS4_ACE_GENERIC_EXECUTE 0x001200A0
#define NFS4_ACE_MASK_ALL 0x001F01FF
#define EXCHGID4_FLAG_SUPP_MOVED_REFER 0x00000001
#define EXCHGID4_FLAG_SUPP_MOVED_MIGR 0x00000002
#define EXCHGID4_FLAG_USE_NON_PNFS 0x00010000
#define EXCHGID4_FLAG_USE_PNFS_MDS 0x00020000
#define EXCHGID4_FLAG_USE_PNFS_DS 0x00040000
#define EXCHGID4_FLAG_UPD_CONFIRMED_REC_A 0x40000000
#define EXCHGID4_FLAG_CONFIRMED_R 0x80000000
/*
* Since the validity of these bits depends on whether
* they're set in the argument or response, have separate
* invalid flag masks for arg (_A) and resp (_R).
*/
#define EXCHGID4_FLAG_MASK_A 0x40070003
#define EXCHGID4_FLAG_MASK_R 0x80070003
#define SEQ4_STATUS_CB_PATH_DOWN 0x00000001
#define SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING 0x00000002
#define SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED 0x00000004
#define SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED 0x00000008
#define SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED 0x00000010
#define SEQ4_STATUS_ADMIN_STATE_REVOKED 0x00000020
#define SEQ4_STATUS_RECALLABLE_STATE_REVOKED 0x00000040
#define SEQ4_STATUS_LEASE_MOVED 0x00000080
#define SEQ4_STATUS_RESTART_RECLAIM_NEEDED 0x00000100
#define NFS4_MAX_UINT64 (~(u64)0)
enum nfs4_acl_whotype {
......@@ -154,6 +194,28 @@ enum nfs_opnum4 {
OP_VERIFY = 37,
OP_WRITE = 38,
OP_RELEASE_LOCKOWNER = 39,
/* nfs41 */
OP_BACKCHANNEL_CTL = 40,
OP_BIND_CONN_TO_SESSION = 41,
OP_EXCHANGE_ID = 42,
OP_CREATE_SESSION = 43,
OP_DESTROY_SESSION = 44,
OP_FREE_STATEID = 45,
OP_GET_DIR_DELEGATION = 46,
OP_GETDEVICEINFO = 47,
OP_GETDEVICELIST = 48,
OP_LAYOUTCOMMIT = 49,
OP_LAYOUTGET = 50,
OP_LAYOUTRETURN = 51,
OP_SECINFO_NO_NAME = 52,
OP_SEQUENCE = 53,
OP_SET_SSV = 54,
OP_TEST_STATEID = 55,
OP_WANT_DELEGATION = 56,
OP_DESTROY_CLIENTID = 57,
OP_RECLAIM_COMPLETE = 58,
OP_ILLEGAL = 10044,
};
......@@ -230,7 +292,48 @@ enum nfsstat4 {
NFS4ERR_DEADLOCK = 10045,
NFS4ERR_FILE_OPEN = 10046,
NFS4ERR_ADMIN_REVOKED = 10047,
NFS4ERR_CB_PATH_DOWN = 10048
NFS4ERR_CB_PATH_DOWN = 10048,
/* nfs41 */
NFS4ERR_BADIOMODE = 10049,
NFS4ERR_BADLAYOUT = 10050,
NFS4ERR_BAD_SESSION_DIGEST = 10051,
NFS4ERR_BADSESSION = 10052,
NFS4ERR_BADSLOT = 10053,
NFS4ERR_COMPLETE_ALREADY = 10054,
NFS4ERR_CONN_NOT_BOUND_TO_SESSION = 10055,
NFS4ERR_DELEG_ALREADY_WANTED = 10056,
NFS4ERR_BACK_CHAN_BUSY = 10057, /* backchan reqs outstanding */
NFS4ERR_LAYOUTTRYLATER = 10058,
NFS4ERR_LAYOUTUNAVAILABLE = 10059,
NFS4ERR_NOMATCHING_LAYOUT = 10060,
NFS4ERR_RECALLCONFLICT = 10061,
NFS4ERR_UNKNOWN_LAYOUTTYPE = 10062,
NFS4ERR_SEQ_MISORDERED = 10063, /* unexpected seq.id in req */
NFS4ERR_SEQUENCE_POS = 10064, /* [CB_]SEQ. op not 1st op */
NFS4ERR_REQ_TOO_BIG = 10065, /* request too big */
NFS4ERR_REP_TOO_BIG = 10066, /* reply too big */
NFS4ERR_REP_TOO_BIG_TO_CACHE = 10067, /* rep. not all cached */
NFS4ERR_RETRY_UNCACHED_REP = 10068, /* retry & rep. uncached */
NFS4ERR_UNSAFE_COMPOUND = 10069, /* retry/recovery too hard */
NFS4ERR_TOO_MANY_OPS = 10070, /* too many ops in [CB_]COMP */
NFS4ERR_OP_NOT_IN_SESSION = 10071, /* op needs [CB_]SEQ. op */
NFS4ERR_HASH_ALG_UNSUPP = 10072, /* hash alg. not supp. */
/* Error 10073 is unused. */
NFS4ERR_CLIENTID_BUSY = 10074, /* clientid has state */
NFS4ERR_PNFS_IO_HOLE = 10075, /* IO to _SPARSE file hole */
NFS4ERR_SEQ_FALSE_RETRY = 10076, /* retry not origional */
NFS4ERR_BAD_HIGH_SLOT = 10077, /* sequence arg bad */
NFS4ERR_DEADSESSION = 10078, /* persistent session dead */
NFS4ERR_ENCR_ALG_UNSUPP = 10079, /* SSV alg mismatch */
NFS4ERR_PNFS_NO_LAYOUT = 10080, /* direct I/O with no layout */
NFS4ERR_NOT_ONLY_OP = 10081, /* bad compound */
NFS4ERR_WRONG_CRED = 10082, /* permissions:state change */
NFS4ERR_WRONG_TYPE = 10083, /* current operation mismatch */
NFS4ERR_DIRDELEG_UNAVAIL = 10084, /* no directory delegation */
NFS4ERR_REJECT_DELEG = 10085, /* on callback */
NFS4ERR_RETURNCONFLICT = 10086, /* outstanding layoutreturn */
NFS4ERR_DELEG_REVOKED = 10087, /* deleg./layout revoked */
};
/*
......@@ -265,7 +368,13 @@ enum opentype4 {
enum createmode4 {
NFS4_CREATE_UNCHECKED = 0,
NFS4_CREATE_GUARDED = 1,
NFS4_CREATE_EXCLUSIVE = 2
NFS4_CREATE_EXCLUSIVE = 2,
/*
* New to NFSv4.1. If session is persistent,
* GUARDED4 MUST be used. Otherwise, use
* EXCLUSIVE4_1 instead of EXCLUSIVE4.
*/
NFS4_CREATE_EXCLUSIVE4_1 = 3
};
enum limit_by4 {
......@@ -301,6 +410,8 @@ enum lock_type4 {
#define FATTR4_WORD0_UNIQUE_HANDLES (1UL << 9)
#define FATTR4_WORD0_LEASE_TIME (1UL << 10)
#define FATTR4_WORD0_RDATTR_ERROR (1UL << 11)
/* Mandatory in NFSv4.1 */
#define FATTR4_WORD2_SUPPATTR_EXCLCREAT (1UL << 11)
/* Recommended Attributes */
#define FATTR4_WORD0_ACL (1UL << 12)
......@@ -391,6 +502,29 @@ enum {
NFSPROC4_CLNT_GETACL,
NFSPROC4_CLNT_SETACL,
NFSPROC4_CLNT_FS_LOCATIONS,
/* nfs41 */
NFSPROC4_CLNT_EXCHANGE_ID,
NFSPROC4_CLNT_CREATE_SESSION,
NFSPROC4_CLNT_DESTROY_SESSION,
NFSPROC4_CLNT_SEQUENCE,
NFSPROC4_CLNT_GET_LEASE_TIME,
};
/* nfs41 types */
struct nfs4_sessionid {
unsigned char data[NFS4_MAX_SESSIONID_LEN];
};
/* Create Session Flags */
#define SESSION4_PERSIST 0x001
#define SESSION4_BACK_CHAN 0x002
#define SESSION4_RDMA 0x004
enum state_protect_how4 {
SP4_NONE = 0,
SP4_MACH_CRED = 1,
SP4_SSV = 2
};
#endif
......
......@@ -76,4 +76,12 @@ void nfsd_reply_cache_shutdown(void);
int nfsd_cache_lookup(struct svc_rqst *, int);
void nfsd_cache_update(struct svc_rqst *, int, __be32 *);
#ifdef CONFIG_NFSD_V4
void nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp);
#else /* CONFIG_NFSD_V4 */
static inline void nfsd4_set_statp(struct svc_rqst *rqstp, __be32 *statp)
{
}
#endif /* CONFIG_NFSD_V4 */
#endif /* NFSCACHE_H */
This diff is collapsed.
......@@ -269,6 +269,13 @@ fh_copy(struct svc_fh *dst, struct svc_fh *src)
return dst;
}
static inline void
fh_copy_shallow(struct knfsd_fh *dst, struct knfsd_fh *src)
{
dst->fh_size = src->fh_size;
memcpy(&dst->fh_base, &src->fh_base, src->fh_size);
}
static __inline__ struct svc_fh *
fh_init(struct svc_fh *fhp, int maxsize)
{
......
......@@ -66,8 +66,7 @@ struct nfs4_cb_recall {
u32 cbr_ident;
int cbr_trunc;
stateid_t cbr_stateid;
u32 cbr_fhlen;
char cbr_fhval[NFS4_FHSIZE];
struct knfsd_fh cbr_fh;
struct nfs4_delegation *cbr_dp;
};
......@@ -86,8 +85,7 @@ struct nfs4_delegation {
};
#define dl_stateid dl_recall.cbr_stateid
#define dl_fhlen dl_recall.cbr_fhlen
#define dl_fhval dl_recall.cbr_fhval
#define dl_fh dl_recall.cbr_fh
/* client delegation callback info */
struct nfs4_callback {
......@@ -101,6 +99,64 @@ struct nfs4_callback {
struct rpc_clnt * cb_client;
};
/* Maximum number of slots per session. 128 is useful for long haul TCP */
#define NFSD_MAX_SLOTS_PER_SESSION 128
/* Maximum number of pages per slot cache entry */
#define NFSD_PAGES_PER_SLOT 1
/* Maximum number of operations per session compound */
#define NFSD_MAX_OPS_PER_COMPOUND 16
struct nfsd4_cache_entry {
__be32 ce_status;
struct kvec ce_datav; /* encoded NFSv4.1 data in rq_res.head[0] */
struct page *ce_respages[NFSD_PAGES_PER_SLOT + 1];
int ce_cachethis;
short ce_resused;
int ce_opcnt;
int ce_rpchdrlen;
};
struct nfsd4_slot {
bool sl_inuse;
u32 sl_seqid;
struct nfsd4_cache_entry sl_cache_entry;
};
struct nfsd4_session {
struct kref se_ref;
struct list_head se_hash; /* hash by sessionid */
struct list_head se_perclnt;
u32 se_flags;
struct nfs4_client *se_client; /* for expire_client */
struct nfs4_sessionid se_sessionid;
u32 se_fmaxreq_sz;
u32 se_fmaxresp_sz;
u32 se_fmaxresp_cached;
u32 se_fmaxops;
u32 se_fnumslots;
struct nfsd4_slot se_slots[]; /* forward channel slots */
};
static inline void
nfsd4_put_session(struct nfsd4_session *ses)
{
extern void free_session(struct kref *kref);
kref_put(&ses->se_ref, free_session);
}
static inline void
nfsd4_get_session(struct nfsd4_session *ses)
{
kref_get(&ses->se_ref);
}
/* formatted contents of nfs4_sessionid */
struct nfsd4_sessionid {
clientid_t clientid;
u32 sequence;
u32 reserved;
};
#define HEXDIR_LEN 33 /* hex version of 16 byte md5 of cl_name plus '\0' */
/*
......@@ -132,6 +188,12 @@ struct nfs4_client {
struct nfs4_callback cl_callback; /* callback info */
atomic_t cl_count; /* ref count */
u32 cl_firststate; /* recovery dir creation */
/* for nfs41 */
struct list_head cl_sessions;
struct nfsd4_slot cl_slot; /* create_session slot */
u32 cl_exchange_flags;
struct nfs4_sessionid cl_sessionid;
};
/* struct nfs4_client_reset
......@@ -168,8 +230,7 @@ struct nfs4_replay {
unsigned int rp_buflen;
char *rp_buf;
unsigned intrp_allocated;
int rp_openfh_len;
char rp_openfh[NFS4_FHSIZE];
struct knfsd_fh rp_openfh;
char rp_ibuf[NFSD4_REPLAY_ISIZE];
};
......@@ -217,7 +278,7 @@ struct nfs4_stateowner {
* share_acces, share_deny on the file.
*/
struct nfs4_file {
struct kref fi_ref;
atomic_t fi_ref;
struct list_head fi_hash; /* hash by "struct inode *" */
struct list_head fi_stateids;
struct list_head fi_delegations;
......@@ -259,14 +320,13 @@ struct nfs4_stateid {
};
/* flags for preprocess_seqid_op() */
#define CHECK_FH 0x00000001
#define HAS_SESSION 0x00000001
#define CONFIRM 0x00000002
#define OPEN_STATE 0x00000004
#define LOCK_STATE 0x00000008
#define RD_STATE 0x00000010
#define WR_STATE 0x00000020
#define CLOSE_STATE 0x00000040
#define DELEG_RET 0x00000080
#define seqid_mutating_err(err) \
(((err) != nfserr_stale_clientid) && \
......@@ -274,7 +334,9 @@ struct nfs4_stateid {
((err) != nfserr_stale_stateid) && \
((err) != nfserr_bad_stateid))
extern __be32 nfs4_preprocess_stateid_op(struct svc_fh *current_fh,
struct nfsd4_compound_state;
extern __be32 nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
stateid_t *stateid, int flags, struct file **filp);
extern void nfs4_lock_state(void);
extern void nfs4_unlock_state(void);
......@@ -290,7 +352,7 @@ extern void nfsd4_init_recdir(char *recdir_name);
extern int nfsd4_recdir_load(void);
extern void nfsd4_shutdown_recdir(void);
extern int nfs4_client_to_reclaim(const char *name);
extern int nfs4_has_reclaimed_state(const char *name);
extern int nfs4_has_reclaimed_state(const char *name, bool use_exchange_id);
extern void nfsd4_recdir_purge_old(void);
extern int nfsd4_create_clid_dir(struct nfs4_client *clp);
extern void nfsd4_remove_clid_dir(struct nfs4_client *clp);
......
......@@ -11,6 +11,11 @@
#include <linux/nfs4.h>
/* thread usage wraps very million seconds (approx one fortnight) */
#define NFSD_USAGE_WRAP (HZ*1000000)
#ifdef __KERNEL__
struct nfsd_stats {
unsigned int rchits; /* repcache hits */
unsigned int rcmisses; /* repcache hits */
......@@ -35,10 +40,6 @@ struct nfsd_stats {
};
/* thread usage wraps very million seconds (approx one fortnight) */
#define NFSD_USAGE_WRAP (HZ*1000000)
#ifdef __KERNEL__
extern struct nfsd_stats nfsdstats;
extern struct svc_stat nfsd_svcstats;
......
......@@ -48,8 +48,20 @@ struct nfsd4_compound_state {
struct svc_fh current_fh;
struct svc_fh save_fh;
struct nfs4_stateowner *replay_owner;
/* For sessions DRC */
struct nfsd4_session *session;
struct nfsd4_slot *slot;
__be32 *statp;
size_t iovlen;
u32 minorversion;
u32 status;
};
static inline bool nfsd4_has_session(struct nfsd4_compound_state *cs)
{
return cs->slot != NULL;
}
struct nfsd4_change_info {
u32 atomic;
u32 before_ctime_sec;
......@@ -90,7 +102,7 @@ struct nfsd4_create {
u32 specdata2;
} dev; /* NF4BLK, NF4CHR */
} u;
u32 cr_bmval[2]; /* request */
u32 cr_bmval[3]; /* request */
struct iattr cr_iattr; /* request */
struct nfsd4_change_info cr_cinfo; /* response */
struct nfs4_acl *cr_acl;
......@@ -105,7 +117,7 @@ struct nfsd4_delegreturn {
};
struct nfsd4_getattr {
u32 ga_bmval[2]; /* request */
u32 ga_bmval[3]; /* request */
struct svc_fh *ga_fhp; /* response */
};
......@@ -206,11 +218,9 @@ struct nfsd4_open {
stateid_t op_delegate_stateid; /* request - response */
u32 op_create; /* request */
u32 op_createmode; /* request */
u32 op_bmval[2]; /* request */
union { /* request */
struct iattr iattr; /* UNCHECKED4,GUARDED4 */
u32 op_bmval[3]; /* request */
struct iattr iattr; /* UNCHECKED4, GUARDED4, EXCLUSIVE4_1 */
nfs4_verifier verf; /* EXCLUSIVE4 */
} u;
clientid_t op_clientid; /* request */
struct xdr_netobj op_owner; /* request */
u32 op_seqid; /* request */
......@@ -224,8 +234,8 @@ struct nfsd4_open {
struct nfs4_stateowner *op_stateowner; /* used during processing */
struct nfs4_acl *op_acl;
};
#define op_iattr u.iattr
#define op_verf u.verf
#define op_iattr iattr
#define op_verf verf
struct nfsd4_open_confirm {
stateid_t oc_req_stateid /* request */;
......@@ -259,7 +269,7 @@ struct nfsd4_readdir {
nfs4_verifier rd_verf; /* request */
u32 rd_dircount; /* request */
u32 rd_maxcount; /* request */
u32 rd_bmval[2]; /* request */
u32 rd_bmval[3]; /* request */
struct svc_rqst *rd_rqstp; /* response */
struct svc_fh * rd_fhp; /* response */
......@@ -301,7 +311,7 @@ struct nfsd4_secinfo {
struct nfsd4_setattr {
stateid_t sa_stateid; /* request */
u32 sa_bmval[2]; /* request */
u32 sa_bmval[3]; /* request */
struct iattr sa_iattr; /* request */
struct nfs4_acl *sa_acl;
};
......@@ -327,7 +337,7 @@ struct nfsd4_setclientid_confirm {
/* also used for NVERIFY */
struct nfsd4_verify {
u32 ve_bmval[2]; /* request */
u32 ve_bmval[3]; /* request */
u32 ve_attrlen; /* request */
char * ve_attrval; /* request */
};
......@@ -344,6 +354,54 @@ struct nfsd4_write {
nfs4_verifier wr_verifier; /* response */
};
struct nfsd4_exchange_id {
nfs4_verifier verifier;
struct xdr_netobj clname;
u32 flags;
clientid_t clientid;
u32 seqid;
int spa_how;
};
struct nfsd4_channel_attrs {
u32 headerpadsz;
u32 maxreq_sz;
u32 maxresp_sz;
u32 maxresp_cached;
u32 maxops;
u32 maxreqs;
u32 nr_rdma_attrs;
u32 rdma_attrs;
};
struct nfsd4_create_session {
clientid_t clientid;
struct nfs4_sessionid sessionid;
u32 seqid;
u32 flags;
struct nfsd4_channel_attrs fore_channel;
struct nfsd4_channel_attrs back_channel;
u32 callback_prog;
u32 uid;
u32 gid;
};
struct nfsd4_sequence {
struct nfs4_sessionid sessionid; /* request/response */
u32 seqid; /* request/response */
u32 slotid; /* request/response */
u32 maxslots; /* request/response */
u32 cachethis; /* request */
#if 0
u32 target_maxslots; /* response */
u32 status_flags; /* response */
#endif /* not yet */
};
struct nfsd4_destroy_session {
struct nfs4_sessionid sessionid;
};
struct nfsd4_op {
int opnum;
__be32 status;
......@@ -378,6 +436,12 @@ struct nfsd4_op {
struct nfsd4_verify verify;
struct nfsd4_write write;
struct nfsd4_release_lockowner release_lockowner;
/* NFSv4.1 */
struct nfsd4_exchange_id exchange_id;
struct nfsd4_create_session create_session;
struct nfsd4_destroy_session destroy_session;
struct nfsd4_sequence sequence;
} u;
struct nfs4_replay * replay;
};
......@@ -416,9 +480,22 @@ struct nfsd4_compoundres {
u32 taglen;
char * tag;
u32 opcnt;
__be32 * tagp; /* where to encode tag and opcount */
__be32 * tagp; /* tag, opcount encode location */
struct nfsd4_compound_state cstate;
};
static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
{
struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
return args->opcnt == 1;
}
static inline bool nfsd4_not_cached(struct nfsd4_compoundres *resp)
{
return !resp->cstate.slot->sl_cache_entry.ce_cachethis ||
nfsd4_is_solo_sequence(resp);
}
#define NFS4_SVC_XDRSIZE sizeof(struct nfsd4_compoundargs)
static inline void
......@@ -448,7 +525,23 @@ extern __be32 nfsd4_setclientid(struct svc_rqst *rqstp,
extern __be32 nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
struct nfsd4_compound_state *,
struct nfsd4_setclientid_confirm *setclientid_confirm);
extern __be32 nfsd4_process_open1(struct nfsd4_open *open);
extern void nfsd4_store_cache_entry(struct nfsd4_compoundres *resp);
extern __be32 nfsd4_replay_cache_entry(struct nfsd4_compoundres *resp,
struct nfsd4_sequence *seq);
extern __be32 nfsd4_exchange_id(struct svc_rqst *rqstp,
struct nfsd4_compound_state *,
struct nfsd4_exchange_id *);
extern __be32 nfsd4_create_session(struct svc_rqst *,
struct nfsd4_compound_state *,
struct nfsd4_create_session *);
extern __be32 nfsd4_sequence(struct svc_rqst *,
struct nfsd4_compound_state *,
struct nfsd4_sequence *);
extern __be32 nfsd4_destroy_session(struct svc_rqst *,
struct nfsd4_compound_state *,
struct nfsd4_destroy_session *);
extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
struct nfsd4_open *open);
extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
struct svc_fh *current_fh, struct nfsd4_open *open);
extern __be32 nfsd4_open_confirm(struct svc_rqst *rqstp,
......
......@@ -24,6 +24,15 @@
*/
typedef int (*svc_thread_fn)(void *);
/* statistics for svc_pool structures */
struct svc_pool_stats {
unsigned long packets;
unsigned long sockets_queued;
unsigned long threads_woken;
unsigned long overloads_avoided;
unsigned long threads_timedout;
};
/*
*
* RPC service thread pool.
......@@ -41,6 +50,8 @@ struct svc_pool {
struct list_head sp_sockets; /* pending sockets */
unsigned int sp_nrthreads; /* # of threads in pool */
struct list_head sp_all_threads; /* all server threads */
int sp_nwaking; /* number of threads woken but not yet active */
struct svc_pool_stats sp_stats; /* statistics on pool operation */
} ____cacheline_aligned_in_smp;
/*
......@@ -83,6 +94,8 @@ struct svc_serv {
struct module * sv_module; /* optional module to count when
* adding threads */
svc_thread_fn sv_function; /* main function for threads */
unsigned int sv_drc_max_pages; /* Total pages for DRC */
unsigned int sv_drc_pages_used;/* DRC pages used */
};
/*
......@@ -218,6 +231,7 @@ struct svc_rqst {
struct svc_cred rq_cred; /* auth info */
void * rq_xprt_ctxt; /* transport specific context ptr */
struct svc_deferred_req*rq_deferred; /* deferred request we are replaying */
int rq_usedeferral; /* use deferral */
size_t rq_xprt_hlen; /* xprt header len */
struct xdr_buf rq_arg;
......@@ -263,6 +277,7 @@ struct svc_rqst {
* cache pages */
wait_queue_head_t rq_wait; /* synchronization */
struct task_struct *rq_task; /* service thread */
int rq_waking; /* 1 if thread is being woken */
};
/*
......@@ -393,6 +408,7 @@ struct svc_serv * svc_create_pooled(struct svc_program *, unsigned int,
void (*shutdown)(struct svc_serv *),
svc_thread_fn, struct module *);
int svc_set_num_threads(struct svc_serv *, struct svc_pool *, int);
int svc_pool_stats_open(struct svc_serv *serv, struct file *file);
void svc_destroy(struct svc_serv *);
int svc_process(struct svc_rqst *);
int svc_register(const struct svc_serv *, const int,
......
......@@ -69,27 +69,27 @@ struct xdr_buf {
* pre-xdr'ed macros.
*/
#define xdr_zero __constant_htonl(0)
#define xdr_one __constant_htonl(1)
#define xdr_two __constant_htonl(2)
#define rpc_success __constant_htonl(RPC_SUCCESS)
#define rpc_prog_unavail __constant_htonl(RPC_PROG_UNAVAIL)
#define rpc_prog_mismatch __constant_htonl(RPC_PROG_MISMATCH)
#define rpc_proc_unavail __constant_htonl(RPC_PROC_UNAVAIL)
#define rpc_garbage_args __constant_htonl(RPC_GARBAGE_ARGS)
#define rpc_system_err __constant_htonl(RPC_SYSTEM_ERR)
#define rpc_drop_reply __constant_htonl(RPC_DROP_REPLY)
#define rpc_auth_ok __constant_htonl(RPC_AUTH_OK)
#define rpc_autherr_badcred __constant_htonl(RPC_AUTH_BADCRED)
#define rpc_autherr_rejectedcred __constant_htonl(RPC_AUTH_REJECTEDCRED)
#define rpc_autherr_badverf __constant_htonl(RPC_AUTH_BADVERF)
#define rpc_autherr_rejectedverf __constant_htonl(RPC_AUTH_REJECTEDVERF)
#define rpc_autherr_tooweak __constant_htonl(RPC_AUTH_TOOWEAK)
#define rpcsec_gsserr_credproblem __constant_htonl(RPCSEC_GSS_CREDPROBLEM)
#define rpcsec_gsserr_ctxproblem __constant_htonl(RPCSEC_GSS_CTXPROBLEM)
#define rpc_autherr_oldseqnum __constant_htonl(101)
#define xdr_zero cpu_to_be32(0)
#define xdr_one cpu_to_be32(1)
#define xdr_two cpu_to_be32(2)
#define rpc_success cpu_to_be32(RPC_SUCCESS)
#define rpc_prog_unavail cpu_to_be32(RPC_PROG_UNAVAIL)
#define rpc_prog_mismatch cpu_to_be32(RPC_PROG_MISMATCH)
#define rpc_proc_unavail cpu_to_be32(RPC_PROC_UNAVAIL)
#define rpc_garbage_args cpu_to_be32(RPC_GARBAGE_ARGS)
#define rpc_system_err cpu_to_be32(RPC_SYSTEM_ERR)
#define rpc_drop_reply cpu_to_be32(RPC_DROP_REPLY)
#define rpc_auth_ok cpu_to_be32(RPC_AUTH_OK)
#define rpc_autherr_badcred cpu_to_be32(RPC_AUTH_BADCRED)
#define rpc_autherr_rejectedcred cpu_to_be32(RPC_AUTH_REJECTEDCRED)
#define rpc_autherr_badverf cpu_to_be32(RPC_AUTH_BADVERF)
#define rpc_autherr_rejectedverf cpu_to_be32(RPC_AUTH_REJECTEDVERF)
#define rpc_autherr_tooweak cpu_to_be32(RPC_AUTH_TOOWEAK)
#define rpcsec_gsserr_credproblem cpu_to_be32(RPCSEC_GSS_CREDPROBLEM)
#define rpcsec_gsserr_ctxproblem cpu_to_be32(RPCSEC_GSS_CTXPROBLEM)
#define rpc_autherr_oldseqnum cpu_to_be32(101)
/*
* Miscellaneous XDR helper functions
......
......@@ -1008,6 +1008,8 @@ svc_process(struct svc_rqst *rqstp)
rqstp->rq_res.tail[0].iov_len = 0;
/* Will be turned off only in gss privacy case: */
rqstp->rq_splice_ok = 1;
/* Will be turned off only when NFSv4 Sessions are used */
rqstp->rq_usedeferral = 1;
/* Setup reply header */
rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp);
......@@ -1078,7 +1080,6 @@ svc_process(struct svc_rqst *rqstp)
procp = versp->vs_proc + proc;
if (proc >= versp->vs_nproc || !procp->pc_func)
goto err_bad_proc;
rqstp->rq_server = serv;
rqstp->rq_procinfo = procp;
/* Syntactic check complete */
......
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment