Commit ec23eb54 authored by Mauro Carvalho Chehab, committed by Jonathan Corbet

docs: fs: convert docs without extension to ReST

There are 3 remaining files without an extension inside the fs docs
dir.

Manually convert them to ReST.

In the case of the nfs/exporting.rst file, as the nfs docs
aren't ported yet, I opted to convert and add a :orphan: there,
which should be removed when it gets added into a nfs-specific
part of the fs documentation.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent 5a5e045b
=================
Directory Locking
=================
Locking scheme used for directory operations is based on two
kinds of locks - per-inode (->i_rwsem) and per-filesystem
(->s_vfs_rename_mutex).
When taking the i_rwsem on multiple non-directory objects, we
always acquire the locks in order by increasing address. We'll call
that "inode pointer" order in the following.
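As an illustration of the "inode pointer" order described above, here is a minimal userspace sketch. It assumes POSIX mutexes as a stand-in for ->i_rwsem taken exclusive. The names struct toy_inode, toy_first_in_lock_order(), toy_lock_two() and toy_unlock_two() are hypothetical, invented for this example, and are not kernel interfaces.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-directory inode; only the lock matters. */
struct toy_inode {
	pthread_mutex_t i_lock;	/* plays the role of ->i_rwsem, taken exclusive */
};

/* Return whichever of the two objects must be locked first (lower address). */
static struct toy_inode *toy_first_in_lock_order(struct toy_inode *a,
						 struct toy_inode *b)
{
	return (uintptr_t)a <= (uintptr_t)b ? a : b;
}

/* Lock two objects in increasing address order; a == b locks only once. */
static void toy_lock_two(struct toy_inode *a, struct toy_inode *b)
{
	struct toy_inode *first, *second;

	assert(a != NULL && b != NULL);
	first = toy_first_in_lock_order(a, b);
	second = (first == a) ? b : a;

	pthread_mutex_lock(&first->i_lock);
	if (second != first)
		pthread_mutex_lock(&second->i_lock);
}

/* Release both locks; the unlock order does not affect deadlock freedom. */
static void toy_unlock_two(struct toy_inode *a, struct toy_inode *b)
{
	pthread_mutex_unlock(&a->i_lock);
	if (b != a)
		pthread_mutex_unlock(&b->i_lock);
}
```

Because every caller agrees on the same total order, two threads calling toy_lock_two(x, y) and toy_lock_two(y, x) both take the lower-addressed lock first, so neither can hold one lock while waiting for the other in the opposite order.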
For our purposes all operations fall in 5 classes:
1) read access. Locking rules: caller locks directory we are accessing.
The lock is taken shared.
......@@ -27,25 +32,29 @@ NB: we might get away with locking the source (and target in exchange
case) shared.
5) link creation. Locking rules:
* lock parent
* check that source is not a directory
* lock source
* call the method.
All locks are exclusive.
6) cross-directory rename. The trickiest in the whole bunch. Locking
rules:
* lock the filesystem
* lock parents in "ancestors first" order.
* find source and target.
* if old parent is equal to or is a descendent of target
fail with -ENOTEMPTY
* if new parent is equal to or is a descendent of source
fail with -ELOOP
* If it's an exchange, lock both the source and the target.
* If the target exists, lock it. If the source is a non-directory,
lock it. If we need to lock both, do so in inode pointer order.
* call the method.
All ->i_rwsem are taken exclusive. Again, we might get away with locking
the source (and target in exchange case) shared.
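The two ancestry checks in the cross-directory rename rules can be sketched in userspace as follows. This is only an illustration under stated assumptions: the directory graph is a tree, each directory records its parent, and every object involved is a directory. The names struct toy_dir, toy_is_equal_or_descendant() and toy_rename_checks() are hypothetical and do not correspond to kernel functions. The error codes follow the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory node: only the parent link is modelled. */
struct toy_dir {
	struct toy_dir *parent;	/* NULL for the root */
};

/*
 * True if d is anc itself or lies somewhere below it.
 * The walk terminates as long as no directory is its own ancestor.
 */
static bool toy_is_equal_or_descendant(const struct toy_dir *d,
				       const struct toy_dir *anc)
{
	for (; d != NULL; d = d->parent)
		if (d == anc)
			return true;
	return false;
}

/*
 * Checks made after both parents are locked and source/target are found.
 * target may be NULL when nothing exists at the destination name.
 */
static int toy_rename_checks(const struct toy_dir *old_parent,
			     const struct toy_dir *new_parent,
			     const struct toy_dir *source,
			     const struct toy_dir *target)
{
	assert(old_parent != NULL && new_parent != NULL && source != NULL);

	/* Old parent at or below target: target cannot be empty. */
	if (target != NULL && toy_is_equal_or_descendant(old_parent, target))
		return -ENOTEMPTY;

	/* New parent at or below source: the move would create a loop. */
	if (toy_is_equal_or_descendant(new_parent, source))
		return -ELOOP;

	return 0;
}
```

For example, with root containing a, and a containing b, moving a into b gives new_parent = b and source = a, so the second check fails with -ELOOP.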
......@@ -54,10 +63,11 @@ read, modified or removed by method will be locked by caller.
If no directory is its own ancestor, the scheme above is deadlock-free.
Proof:
First of all, at any moment we have a partial ordering of the
objects - A < B iff A is an ancestor of B.
That ordering can change. However, the following is true:
......@@ -77,32 +87,32 @@ objects - A < B iff A is an ancestor of B.
non-directory object, except renames, which take locks on source and
target in inode pointer order in the case they are not directories.)
Now consider the minimal deadlock. Each process is blocked on
attempt to acquire some lock and already holds at least one lock. Let's
consider the set of contended locks. First of all, filesystem lock is
not contended, since any process blocked on it is not holding any locks.
Thus all processes are blocked on ->i_rwsem.
By (3), any process holding a non-directory lock can only be
waiting on another non-directory lock with a larger address. Therefore
the process holding the "largest" such lock can always make progress, and
non-directory objects are not included in the set of contended locks.
Thus link creation can't be a part of deadlock - it can't be
blocked on source and it means that it doesn't hold any locks.
Any contended object is either held by cross-directory rename or
has a child that is also contended. Indeed, suppose that it is held by
operation other than cross-directory rename. Then the lock this operation
is blocked on belongs to child of that object due to (1).
It means that one of the operations is cross-directory rename.
Otherwise the set of contended objects would be infinite - each of them
would have a contended child and we had assumed that no object is its
own descendent. Moreover, there is exactly one cross-directory rename
(see above).
Consider the object blocking the cross-directory rename. One
of its descendents is locked by cross-directory rename (otherwise we
would again have an infinite set of contended objects). But that
means that cross-directory rename is taking locks out of order. Due
......@@ -112,7 +122,7 @@ try to acquire lock on descendent before the lock on ancestor.
Contradiction. I.e. deadlock is impossible. Q.E.D.
These operations are guaranteed to avoid loop creation. Indeed,
the only operation that could introduce loops is cross-directory rename.
Since the only new (parent, child) pair added by rename() is (new parent,
source), such loop would have to contain these objects and the rest of it
......@@ -123,13 +133,13 @@ new parent had been equal to or a descendent of source since the moment when
we had acquired filesystem lock and rename() would fail with -ELOOP in that
case.
While this locking scheme works for arbitrary DAGs, it relies on
ability to check that directory is a descendent of another object. Current
implementation assumes that directory graph is a tree. This assumption is
also preserved by all operations (cross-directory rename on a tree that would
not introduce a cycle will leave it a tree and link() fails for directories).
Notice that "directory" in the above == "anything that might have
children", so if we are going to introduce hybrid objects we will need
either to make sure that link(2) doesn't work for them or to make changes
in is_subdir() that would make it work even in presence of such beasts.
......@@ -20,6 +20,8 @@ algorithms work.
path-lookup
api-summary
splice
locking
directory-locking
Filesystem support layers
=========================
......
:orphan:
Making Filesystems Exportable
=============================
......@@ -42,9 +43,9 @@ filehandle fragment, there is no automatic creation of a path prefix
for the object. This leads to two related but distinct features of
the dcache that are not needed for normal filesystem access.
-1/ The dcache must sometimes contain objects that are not part of the
+1. The dcache must sometimes contain objects that are not part of the
proper prefix. i.e. that are not connected to the root.
-2/ The dcache must be prepared for a newly found (via ->lookup) directory
+2. The dcache must be prepared for a newly found (via ->lookup) directory
to already have a (non-connected) dentry, and must be able to move
that dentry into place (based on the parent and name in the
->lookup). This is particularly needed for directories as
......@@ -52,7 +53,7 @@ the dcache that are not needed for normal filesystem access.
To implement these features, the dcache has:
-a/ A dentry flag DCACHE_DISCONNECTED which is set on
+a. A dentry flag DCACHE_DISCONNECTED which is set on
any dentry that might not be part of the proper prefix.
This is set when anonymous dentries are created, and cleared when a
dentry is noticed to be a child of a dentry which is in the proper
......@@ -71,48 +72,52 @@ a/ A dentry flag DCACHE_DISCONNECTED which is set on
dentries. That guarantees that we won't need to hunt them down upon
umount.
-b/ A primitive for creation of secondary roots - d_obtain_root(inode).
+b. A primitive for creation of secondary roots - d_obtain_root(inode).
Those do _not_ bear DCACHE_DISCONNECTED. They are placed on the
per-superblock list (->s_roots), so they can be located at umount
time for eviction purposes.
-c/ Helper routines to allocate anonymous dentries, and to help attach
+c. Helper routines to allocate anonymous dentries, and to help attach
loose directory dentries at lookup time. They are:
d_obtain_alias(inode) will return a dentry for the given inode.
If the inode already has a dentry, one of those is returned.
If it doesn't, a new anonymous (IS_ROOT and
DCACHE_DISCONNECTED) dentry is allocated and attached.
In the case of a directory, care is taken that only one dentry
can ever be attached.
d_splice_alias(inode, dentry) will introduce a new dentry into the tree;
either the passed-in dentry or a preexisting alias for the given inode
(such as an anonymous one created by d_obtain_alias), if appropriate.
It returns NULL when the passed-in dentry is used, following the calling
convention of ->lookup.
Filesystem Issues
-----------------
For a filesystem to be exportable it must:
-1/ provide the filehandle fragment routines described below.
-2/ make sure that d_splice_alias is used rather than d_add
+1. provide the filehandle fragment routines described below.
+2. make sure that d_splice_alias is used rather than d_add
when ->lookup finds an inode for a given parent and name.
-If inode is NULL, d_splice_alias(inode, dentry) is equivalent to
+If inode is NULL, d_splice_alias(inode, dentry) is equivalent to::
d_add(dentry, inode), NULL
Similarly, d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)
-Typically the ->lookup routine will simply end with a:
+Typically the ->lookup routine will simply end with a::
return d_splice_alias(inode, dentry);
}
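The return convention described above (NULL when the passed-in dentry is used, and an error pointer handed straight back) can be modelled in userspace as follows. This is a rough sketch only: the toy_err_ptr(), toy_ptr_err() and toy_is_err() helpers imitate the idea behind the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), while struct toy_dentry and toy_splice() are hypothetical names that model nothing but the return values. The case where a preexisting alias is returned instead is deliberately left out.

```c
#include <assert.h>
#include <errno.h>	/* errno constants that callers encode, e.g. ENOMEM */
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer value falls in the reserved error range. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static long toy_ptr_err(const void *p)
{
	assert(toy_is_err(p));
	return (long)(intptr_t)p;
}

/* Hypothetical dentry: only the attached inode is modelled. */
struct toy_dentry {
	void *inode;	/* NULL means a negative dentry */
};

/* Models only the return convention, not the real dcache logic. */
static struct toy_dentry *toy_splice(void *inode, struct toy_dentry *dentry)
{
	if (toy_is_err(inode))
		return toy_err_ptr(toy_ptr_err(inode));	/* error passes through */

	dentry->inode = inode;	/* a NULL inode behaves like d_add(dentry, NULL) */
	return NULL;		/* the passed-in dentry was used */
}
```

A caller in this model can therefore end with a single return of toy_splice(inode, dentry), whether the inode was found, was missing, or the search failed with something like toy_err_ptr(-ENOMEM).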
A file system implementation declares that instances of the filesystem
are exportable by setting the s_export_op field in the struct
super_block. This field must point to a "struct export_operations"
struct which has the following members:
......
......@@ -20,7 +20,7 @@ kernel which allows different filesystem implementations to coexist.
VFS system calls open(2), stat(2), read(2), write(2), chmod(2) and so on
are called from a process context. Filesystem locking is described in
-the document Documentation/filesystems/Locking.
+the document Documentation/filesystems/locking.rst.
Directory Entry Cache (dcache)
......
......@@ -24,7 +24,7 @@
*/
/*
- * See Documentation/filesystems/nfs/Exporting
+ * See Documentation/filesystems/nfs/exporting.rst
* and examples in fs/exportfs
*
* Since cifs is a network file system, an "fsid" must be included for
......
......@@ -7,7 +7,7 @@
* and for mapping back from file handles to dentries.
*
* For details on why we do all the strange and hairy things in here
- * take a look at Documentation/filesystems/nfs/Exporting.
+ * take a look at Documentation/filesystems/nfs/exporting.rst.
*/
#include <linux/exportfs.h>
#include <linux/fs.h>
......
......@@ -10,7 +10,7 @@
*
* The following files are helpful:
*
- * Documentation/filesystems/nfs/Exporting
+ * Documentation/filesystems/nfs/exporting.rst
* fs/exportfs/expfs.c.
*/
......
......@@ -555,7 +555,7 @@ static int orangefs_fsync(struct file *file,
* Change the file pointer position for an instance of an open file.
*
* \note If .llseek is overridden, we must acquire lock as described in
- * Documentation/filesystems/Locking.
+ * Documentation/filesystems/locking.rst.
*
* Future upgrade could support SEEK_DATA and SEEK_HOLE but would
* require much changes to the FS
......
......@@ -151,7 +151,7 @@ struct dentry_operations {
/*
* Locking rules for dentry_operations callbacks are to be found in
- * Documentation/filesystems/Locking. Keep it updated!
+ * Documentation/filesystems/locking.rst. Keep it updated!
*
* Further descriptions are found in Documentation/filesystems/vfs.rst.
* Keep it updated too!
......
......@@ -139,7 +139,7 @@ struct fid {
* @get_parent: find the parent of a given directory
* @commit_metadata: commit metadata changes to stable storage
*
- * See Documentation/filesystems/nfs/Exporting for details on how to use
+ * See Documentation/filesystems/nfs/exporting.rst for details on how to use
* this interface correctly.
*
* encode_fh:
......