Commit ec23eb54 authored by Mauro Carvalho Chehab, committed by Jonathan Corbet

docs: fs: convert docs without extension to ReST

There are 3 remaining files without an extension inside the fs docs
dir.

Manually convert them to ReST.

In the case of the nfs/exporting.rst file, as the nfs docs
aren't ported yet, I opted to convert and add a :orphan: there,
which should be removed when it gets added into a nfs-specific
part of the fs documentation.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent 5a5e045b
+=================
+Directory Locking
+=================
Locking scheme used for directory operations is based on two
kinds of locks - per-inode (->i_rwsem) and per-filesystem
(->s_vfs_rename_mutex).

When taking the i_rwsem on multiple non-directory objects, we
always acquire the locks in order by increasing address. We'll call
that "inode pointer" order in the following.

For our purposes all operations fall in 5 classes:

1) read access. Locking rules: caller locks directory we are accessing.
The lock is taken shared.
@@ -27,25 +32,29 @@ NB: we might get away with locking the the source (and target in exchange
case) shared.

5) link creation. Locking rules:
* lock parent
* check that source is not a directory
* lock source
* call the method.
All locks are exclusive.

6) cross-directory rename. The trickiest in the whole bunch. Locking
rules:
* lock the filesystem
* lock parents in "ancestors first" order.
* find source and target.
* if old parent is equal to or is a descendent of target
  fail with -ENOTEMPTY
* if new parent is equal to or is a descendent of source
  fail with -ELOOP
* If it's an exchange, lock both the source and the target.
* If the target exists, lock it. If the source is a non-directory,
  lock it. If we need to lock both, do so in inode pointer order.
* call the method.
All ->i_rwsem are taken exclusive. Again, we might get away with locking
the the source (and target in exchange case) shared.
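The "inode pointer" order used in the rules above can be illustrated with a small userspace sketch. This is not kernel code: `struct toy_inode`, `toy_first()`, `toy_lock_two()` and `toy_unlock_two()` are hypothetical names, and a pthread mutex stands in for ->i_rwsem. It only shows the idea of always taking the lock at the lower address first, so that two threads locking the same pair can never wait on each other in a cycle.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-directory inode; only the lock matters. */
struct toy_inode {
    pthread_mutex_t lock;   /* plays the role of ->i_rwsem, taken exclusive */
};

/* Return whichever of the two objects sits at the lower address. */
static struct toy_inode *toy_first(struct toy_inode *a, struct toy_inode *b)
{
    return ((uintptr_t)a <= (uintptr_t)b) ? a : b;
}

/* Lock both objects in increasing address order ("inode pointer" order). */
static void toy_lock_two(struct toy_inode *a, struct toy_inode *b)
{
    assert(a != NULL && b != NULL);

    struct toy_inode *first = toy_first(a, b);
    struct toy_inode *second = (first == a) ? b : a;

    pthread_mutex_lock(&first->lock);
    if (second != first)            /* same object passed twice: lock once */
        pthread_mutex_lock(&second->lock);
}

/* Unlock order does not affect deadlock freedom; release both. */
static void toy_unlock_two(struct toy_inode *a, struct toy_inode *b)
{
    pthread_mutex_unlock(&a->lock);
    if (b != a)
        pthread_mutex_unlock(&b->lock);
}
```

Because the order depends only on the addresses, callers may pass the two objects in either order and still acquire the locks in the same sequence.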
@@ -54,10 +63,11 @@ read, modified or removed by method will be locked by caller.
If no directory is its own ancestor, the scheme above is deadlock-free.

Proof:

First of all, at any moment we have a partial ordering of the
objects - A < B iff A is an ancestor of B.

That ordering can change. However, the following is true:
@@ -77,32 +87,32 @@ objects - A < B iff A is an ancestor of B.
non-directory object, except renames, which take locks on source and
target in inode pointer order in the case they are not directories.)

Now consider the minimal deadlock. Each process is blocked on
attempt to acquire some lock and already holds at least one lock. Let's
consider the set of contended locks. First of all, filesystem lock is
not contended, since any process blocked on it is not holding any locks.
Thus all processes are blocked on ->i_rwsem.

By (3), any process holding a non-directory lock can only be
waiting on another non-directory lock with a larger address. Therefore
the process holding the "largest" such lock can always make progress, and
non-directory objects are not included in the set of contended locks.

Thus link creation can't be a part of deadlock - it can't be
blocked on source and it means that it doesn't hold any locks.

Any contended object is either held by cross-directory rename or
has a child that is also contended. Indeed, suppose that it is held by
operation other than cross-directory rename. Then the lock this operation
is blocked on belongs to child of that object due to (1).

It means that one of the operations is cross-directory rename.
Otherwise the set of contended objects would be infinite - each of them
would have a contended child and we had assumed that no object is its
own descendent. Moreover, there is exactly one cross-directory rename
(see above).

Consider the object blocking the cross-directory rename. One
of its descendents is locked by cross-directory rename (otherwise we
would again have an infinite set of contended objects). But that
means that cross-directory rename is taking locks out of order. Due
@@ -112,7 +122,7 @@ try to acquire lock on descendent before the lock on ancestor.
Contradiction. I.e. deadlock is impossible. Q.E.D.

These operations are guaranteed to avoid loop creation. Indeed,
the only operation that could introduce loops is cross-directory rename.
Since the only new (parent, child) pair added by rename() is (new parent,
source), such loop would have to contain these objects and the rest of it
@@ -123,13 +133,13 @@ new parent had been equal to or a descendent of source since the moment when
we had acquired filesystem lock and rename() would fail with -ELOOP in that
case.

While this locking scheme works for arbitrary DAGs, it relies on
ability to check that directory is a descendent of another object. Current
implementation assumes that directory graph is a tree. This assumption is
also preserved by all operations (cross-directory rename on a tree that would
not introduce a cycle will leave it a tree and link() fails for directories).

Notice that "directory" in the above == "anything that might have
children", so if we are going to introduce hybrid objects we will need
either to make sure that link(2) doesn't work for them or to make changes
in is_subdir() that would make it work even in presence of such beasts.
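The two descendant checks in the cross-directory rename rules can be sketched in userspace as a walk up parent pointers. This is an illustration only, under the tree assumption stated above: `struct toy_dir`, `toy_is_subdir()` and `toy_rename_check()` are hypothetical names, every object is modelled as a node with a single parent, and the real is_subdir() in the kernel works on dentries and has to cope with concurrent renames, which this sketch ignores.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree node: each object has exactly one parent. */
struct toy_dir {
    struct toy_dir *parent;     /* NULL for the root */
};

/* True if d is equal to anc or is a descendant of anc. */
static bool toy_is_subdir(const struct toy_dir *d, const struct toy_dir *anc)
{
    /* Terminates because no object is its own ancestor (tree assumption). */
    for (; d != NULL; d = d->parent)
        if (d == anc)
            return true;
    return false;
}

/*
 * The two checks from the cross-directory rename rules.
 * target may be NULL when nothing exists at the destination name.
 */
static int toy_rename_check(const struct toy_dir *old_parent,
                            const struct toy_dir *new_parent,
                            const struct toy_dir *source,
                            const struct toy_dir *target)
{
    assert(old_parent != NULL && new_parent != NULL && source != NULL);

    /* Old parent is equal to or a descendant of target. */
    if (target != NULL && toy_is_subdir(old_parent, target))
        return -ENOTEMPTY;

    /* New parent is equal to or a descendant of source: would make a loop. */
    if (toy_is_subdir(new_parent, source))
        return -ELOOP;

    return 0;
}
```

For example, with a tree root/a/b, moving a into b gives a new parent (b) that is a descendant of the source (a), so the sketch returns -ELOOP, matching the loop-avoidance argument above.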
@@ -20,6 +20,8 @@ algorithms work.
path-lookup
api-summary
splice
+locking
+directory-locking

Filesystem support layers
=========================
......
+:orphan:
Making Filesystems Exportable
=============================
@@ -42,9 +43,9 @@ filehandle fragment, there is no automatic creation of a path prefix
for the object. This leads to two related but distinct features of
the dcache that are not needed for normal filesystem access.

-1/ The dcache must sometimes contain objects that are not part of the
+1. The dcache must sometimes contain objects that are not part of the
proper prefix. i.e that are not connected to the root.
-2/ The dcache must be prepared for a newly found (via ->lookup) directory
+2. The dcache must be prepared for a newly found (via ->lookup) directory
to already have a (non-connected) dentry, and must be able to move
that dentry into place (based on the parent and name in the
->lookup). This is particularly needed for directories as
@@ -52,7 +53,7 @@ the dcache that are not needed for normal filesystem access.
To implement these features, the dcache has:

-a/ A dentry flag DCACHE_DISCONNECTED which is set on
+a. A dentry flag DCACHE_DISCONNECTED which is set on
any dentry that might not be part of the proper prefix.
This is set when anonymous dentries are created, and cleared when a
dentry is noticed to be a child of a dentry which is in the proper
@@ -71,48 +72,52 @@ a/ A dentry flag DCACHE_DISCONNECTED which is set on
dentries. That guarantees that we won't need to hunt them down upon
umount.

-b/ A primitive for creation of secondary roots - d_obtain_root(inode).
+b. A primitive for creation of secondary roots - d_obtain_root(inode).
Those do _not_ bear DCACHE_DISCONNECTED. They are placed on the
per-superblock list (->s_roots), so they can be located at umount
time for eviction purposes.

-c/ Helper routines to allocate anonymous dentries, and to help attach
+c. Helper routines to allocate anonymous dentries, and to help attach
loose directory dentries at lookup time. They are:

d_obtain_alias(inode) will return a dentry for the given inode.
If the inode already has a dentry, one of those is returned.
If it doesn't, a new anonymous (IS_ROOT and
DCACHE_DISCONNECTED) dentry is allocated and attached.
In the case of a directory, care is taken that only one dentry
can ever be attached.

d_splice_alias(inode, dentry) will introduce a new dentry into the tree;
either the passed-in dentry or a preexisting alias for the given inode
(such as an anonymous one created by d_obtain_alias), if appropriate.
It returns NULL when the passed-in dentry is used, following the calling
convention of ->lookup.

Filesystem Issues
-----------------

For a filesystem to be exportable it must:

-1/ provide the filehandle fragment routines described below.
+1. provide the filehandle fragment routines described below.
-2/ make sure that d_splice_alias is used rather than d_add
+2. make sure that d_splice_alias is used rather than d_add
when ->lookup finds an inode for a given parent and name.

-If inode is NULL, d_splice_alias(inode, dentry) is equivalent to
+If inode is NULL, d_splice_alias(inode, dentry) is equivalent to::
d_add(dentry, inode), NULL
Similarly, d_splice_alias(ERR_PTR(err), dentry) = ERR_PTR(err)

-Typically the ->lookup routine will simply end with a:
+Typically the ->lookup routine will simply end with a::
return d_splice_alias(inode, dentry);
}

A file system implementation declares that instances of the filesystem
are exportable by setting the s_export_op field in the struct
super_block. This field must point to a "struct export_operations"
struct which has the following members:
......
@@ -20,7 +20,7 @@ kernel which allows different filesystem implementations to coexist.
VFS system calls open(2), stat(2), read(2), write(2), chmod(2) and so on
are called from a process context. Filesystem locking is described in
-the document Documentation/filesystems/Locking.
+the document Documentation/filesystems/locking.rst.

Directory Entry Cache (dcache)
......
@@ -24,7 +24,7 @@
 */
/*
- * See Documentation/filesystems/nfs/Exporting
+ * See Documentation/filesystems/nfs/exporting.rst
 * and examples in fs/exportfs
 *
 * Since cifs is a network file system, an "fsid" must be included for
......
@@ -7,7 +7,7 @@
 * and for mapping back from file handles to dentries.
 *
 * For details on why we do all the strange and hairy things in here
- * take a look at Documentation/filesystems/nfs/Exporting.
+ * take a look at Documentation/filesystems/nfs/exporting.rst.
 */
#include <linux/exportfs.h>
#include <linux/fs.h>
......
@@ -10,7 +10,7 @@
 *
 * The following files are helpful:
 *
- * Documentation/filesystems/nfs/Exporting
+ * Documentation/filesystems/nfs/exporting.rst
 * fs/exportfs/expfs.c.
 */
......
@@ -555,7 +555,7 @@ static int orangefs_fsync(struct file *file,
 * Change the file pointer position for an instance of an open file.
 *
 * \note If .llseek is overriden, we must acquire lock as described in
- * Documentation/filesystems/Locking.
+ * Documentation/filesystems/locking.rst.
 *
 * Future upgrade could support SEEK_DATA and SEEK_HOLE but would
 * require much changes to the FS
......
@@ -151,7 +151,7 @@ struct dentry_operations {
/*
 * Locking rules for dentry_operations callbacks are to be found in
- * Documentation/filesystems/Locking. Keep it updated!
+ * Documentation/filesystems/locking.rst. Keep it updated!
 *
 * FUrther descriptions are found in Documentation/filesystems/vfs.rst.
 * Keep it updated too!
......
@@ -139,7 +139,7 @@ struct fid {
 * @get_parent: find the parent of a given directory
 * @commit_metadata: commit metadata changes to stable storage
 *
- * See Documentation/filesystems/nfs/Exporting for details on how to use
+ * See Documentation/filesystems/nfs/exporting.rst for details on how to use
 * this interface correctly.
 *
 * encode_fh:
......