Commit c6b80eb8 authored by Linus Torvalds

Merge tag 'ovl-update-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs update from Miklos Szeredi:

 - Fix failure to copy-up files from certain NFSv4 mounts

 - Sort out inconsistencies between st_ino and i_ino (used in /proc/locks)

 - Allow consistent (POSIX-y) inode numbering in more cases

 - Allow virtiofs to be used as upper layer

 - Miscellaneous cleanups and fixes

* tag 'ovl-update-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: document xino expected behavior
  ovl: enable xino automatically in more cases
  ovl: avoid possible inode number collisions with xino=on
  ovl: use a private non-persistent ino pool
  ovl: fix WARN_ON nlink drop to zero
  ovl: fix a typo in comment
  ovl: replace zero-length array with flexible-array member
  ovl: ovl_obtain_alias(): don't call d_instantiate_anon() for old
  ovl: strict upper fs requirements for remote upper fs
  ovl: check if upper fs supports RENAME_WHITEOUT
  ovl: allow remote upper
  ovl: decide if revalidate needed on a per-dentry basis
  ovl: separate detection of remote upper layer from stacked overlay
  ovl: restructure dentry revalidation
  ovl: ignore failure to copy up unknown xattrs
  ovl: document permission model
  ovl: simplify i_ino initialization
  ovl: factor out helper ovl_get_root()
  ovl: fix out of date comment and unreachable code
  ovl: fix value of i_ino for lower hardlink corner case
parents 9744b923 2eda9eaa
@@ -40,13 +40,46 @@ On 64bit systems, even if all overlay layers are not on the same
 underlying filesystem, the same compliant behavior could be achieved
 with the "xino" feature. The "xino" feature composes a unique object
 identifier from the real object st_ino and an underlying fsid index.
+
 If all underlying filesystems support NFS file handles and export file
 handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
 will use the high inode number bits for fsid. Even when the underlying
 filesystem uses 64bit inode numbers, users can still enable the "xino"
 feature with the "-o xino=on" overlay mount option. That is useful for the
 case of underlying filesystems like xfs and tmpfs, which use 64bit inode
-numbers, but are very unlikely to use the high inode number bit.
+numbers, but are very unlikely to use the high inode number bits. In case
+the underlying inode number does overflow into the high xino bits, overlay
+filesystem will fall back to the non xino behavior for that inode.
+
+The following table summarizes what can be expected in different overlay
+configurations.
+
+Inode properties
+````````````````
+
++--------------+------------+------------+-----------------+----------------+
+|Configuration | Persistent | Uniform    | st_ino == d_ino | d_ino == i_ino |
+|              | st_ino     | st_dev     |                 | [*]            |
++==============+=====+======+=====+======+========+========+========+=======+
+|              | dir | !dir | dir | !dir |  dir   +  !dir  |  dir   | !dir  |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| All layers   |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+| on same fs   |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| Layers not   |  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| on same fs,  |     |      |     |      |        |        |        |       |
+| xino=off     |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+|              |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto,|  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| ino overflow |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+
+[*] nfsd v3 readdirplus verifies d_ino == i_ino. i_ino is exposed via several
+/proc files, such as /proc/locks and /proc/self/fdinfo/<fd> of an inotify
+file descriptor.
 
 
 Upper and Lower
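To illustrate the "xino" paragraph in the hunk above, here is a minimal userspace sketch, not the kernel implementation, of packing an fsid index into the unused high bits of a 64-bit inode number and signalling overflow so the caller can fall back. The name `xino_compose` and its parameters are hypothetical. The sketch is deliberately simplified: it does not model the extra reserved bit that the code changes later in this commit introduce (the `xinoshift + 1` shifts).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only: try to place `fsid` in the top `xinobits` bits
 * of a 64-bit inode number. On success, store the combined value in *out
 * and return true. Return false when the real inode number (or the fsid)
 * does not fit, in which case a caller would fall back to the non-xino
 * behavior for that inode.
 */
static bool xino_compose(uint64_t real_ino, unsigned int fsid,
			 unsigned int xinobits, uint64_t *out)
{
	unsigned int shift;

	assert(xinobits >= 1 && xinobits <= 32);
	shift = 64 - xinobits;

	/* The fsid itself must fit into the reserved high bits. */
	if ((uint64_t)fsid >> xinobits)
		return false;

	/* Overflow: the real inode number already uses the high bits. */
	if (real_ino >> shift)
		return false;

	*out = real_ino | ((uint64_t)fsid << shift);
	return true;
}
```

With 8 reserved bits, inode 42 on fsid 1 would map to bit 56 set plus 42, while an inode number that already has bit 60 set would be rejected and left to the fallback path.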
@@ -248,6 +281,50 @@ overlay filesystem (though an operation on the name of the file such as
 rename or unlink will of course be noticed and handled).
 
 
+Permission model
+----------------
+
+Permission checking in the overlay filesystem follows these principles:
+
+ 1) permission check SHOULD return the same result before and after copy up
+
+ 2) task creating the overlay mount MUST NOT gain additional privileges
+
+ 3) non-mounting task MAY gain additional privileges through the overlay,
+ compared to direct access on underlying lower or upper filesystems
+
+This is achieved by performing two permission checks on each access
+
+ a) check if current task is allowed access based on local DAC (owner,
+    group, mode and posix acl), as well as MAC checks
+
+ b) check if mounting task would be allowed real operation on lower or
+    upper layer based on underlying filesystem permissions, again including
+    MAC checks
+
+Check (a) ensures consistency (1) since owner, group, mode and posix acls
+are copied up. On the other hand it can result in server enforced
+permissions (used by NFS, for example) being ignored (3).
+
+Check (b) ensures that no task gains permissions to underlying layers that
+the mounting task does not have (2). This also means that it is possible
+to create setups where the consistency rule (1) does not hold; normally,
+however, the mounting task will have sufficient privileges to perform all
+operations.
+
+Another way to demonstrate this model is drawing parallels between
+
+  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,... /merged
+
+and
+
+  cp -a /lower /upper
+  mount --bind /upper /merged
+
+The resulting access permissions should be the same. The difference is in
+the time of copy (on-demand vs. up-front).
+
+
 Multiple lower layers
 ---------------------
 
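The two-check rule from the "Permission model" text above can be modelled with a toy userspace sketch. Everything here is hypothetical (`toy_cred`, `toy_obj`, `toy_dac_allows`, `toy_ovl_permission`); it covers only owner/group/other mode bits, treats uid 0 as always allowed as a crude stand-in for capabilities, and leaves out POSIX ACLs and MAC checks entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified credentials and object attributes. */
struct toy_cred { unsigned int uid, gid; };
struct toy_obj  { unsigned int uid, gid, mode; };	/* mode: rwxrwxrwx */

/* Classic owner/group/other check for a 3-bit rwx mask (r=4, w=2, x=1). */
static bool toy_dac_allows(const struct toy_cred *c, const struct toy_obj *o,
			   unsigned int mask)
{
	unsigned int bits;

	assert(mask <= 7);
	if (c->uid == 0)
		return true;	/* crude stand-in for root's capabilities */
	if (c->uid == o->uid)
		bits = (o->mode >> 6) & 7;
	else if (c->gid == o->gid)
		bits = (o->mode >> 3) & 7;
	else
		bits = o->mode & 7;
	return (bits & mask) == mask;
}

/* Access is granted only if both checks pass. */
static bool toy_ovl_permission(const struct toy_cred *task,
			       const struct toy_cred *mounter,
			       const struct toy_obj *ovl_obj,
			       const struct toy_obj *real_obj,
			       unsigned int mask)
{
	return toy_dac_allows(task, ovl_obj, mask) &&	/* check (a) */
	       toy_dac_allows(mounter, real_obj, mask);	/* check (b) */
}
```

In this model a file owned by uid 1000 with mode 0600 is readable by uid 1000 through an overlay mounted by root, but not through one mounted by an unrelated unprivileged user, which is principle (2) in miniature.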
@@ -383,7 +460,8 @@ guarantee that the values of st_ino and st_dev returned by stat(2) and the
 value of d_ino returned by readdir(3) will act like on a normal filesystem.
 E.g. the value of st_dev may be different for two objects in the same
 overlay filesystem and the value of st_ino for directory objects may not be
-persistent and could change even while the overlay filesystem is mounted.
+persistent and could change even while the overlay filesystem is mounted, as
+summarized in the `Inode properties`_ table above.
 
 
 Changes to underlying filesystems
......
@@ -36,6 +36,13 @@ static int ovl_ccup_get(char *buf, const struct kernel_param *param)
 module_param_call(check_copy_up, ovl_ccup_set, ovl_ccup_get, NULL, 0644);
 MODULE_PARM_DESC(check_copy_up, "Obsolete; does nothing");
 
+static bool ovl_must_copy_xattr(const char *name)
+{
+	return !strcmp(name, XATTR_POSIX_ACL_ACCESS) ||
+	       !strcmp(name, XATTR_POSIX_ACL_DEFAULT) ||
+	       !strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN);
+}
+
 int ovl_copy_xattr(struct dentry *old, struct dentry *new)
 {
 	ssize_t list_size, size, value_size = 0;
@@ -107,8 +114,13 @@ int ovl_copy_xattr(struct dentry *old, struct dentry *new)
 			continue; /* Discard */
 		}
 		error = vfs_setxattr(new, name, value, size, 0);
-		if (error)
-			break;
+		if (error) {
+			if (error != -EOPNOTSUPP || ovl_must_copy_xattr(name))
+				break;
+
+			/* Ignore failure to copy unknown xattrs */
+			error = 0;
+		}
 	}
 	kfree(value);
 out:
......
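The error handling added to ovl_copy_xattr() above boils down to a small decision rule, sketched here in userspace C. The helper names `must_copy_xattr` and `xattr_copy_result` are hypothetical, and the string literals are written out on the assumption that they match the usual Linux xattr names behind XATTR_POSIX_ACL_ACCESS, XATTR_POSIX_ACL_DEFAULT and XATTR_SECURITY_PREFIX.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Attributes whose loss would change permission or security semantics. */
static bool must_copy_xattr(const char *name)
{
	assert(name != NULL);
	return !strcmp(name, "system.posix_acl_access") ||
	       !strcmp(name, "system.posix_acl_default") ||
	       !strncmp(name, "security.", strlen("security."));
}

/*
 * Given the result of trying to set one xattr on the upper file, return 0
 * to keep going or a negative errno to abort the copy up. Only an
 * -EOPNOTSUPP failure on a non-essential attribute is ignored.
 */
static int xattr_copy_result(const char *name, int error)
{
	if (error && (error != -EOPNOTSUPP || must_copy_xattr(name)))
		return error;
	return 0;
}
```

Under this rule an unsupported "user.*" attribute is silently skipped, while an unsupported "security.*" attribute, or any other error such as -ENOSPC, still aborts the copy.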
@@ -42,7 +42,7 @@ int ovl_cleanup(struct inode *wdir, struct dentry *wdentry)
 	return err;
 }
 
-static struct dentry *ovl_lookup_temp(struct dentry *workdir)
+struct dentry *ovl_lookup_temp(struct dentry *workdir)
 {
 	struct dentry *temp;
 	char name[20];
@@ -243,6 +243,9 @@ static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
 
 	ovl_dir_modified(dentry->d_parent, false);
 	ovl_dentry_set_upper_alias(dentry);
+	ovl_dentry_update_reval(dentry, newdentry,
+			DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
+
 	if (!hardlink) {
 		/*
 		 * ovl_obtain_alias() can be called after ovl_create_real()
@@ -819,6 +822,28 @@ static bool ovl_pure_upper(struct dentry *dentry)
 	       !ovl_test_flag(OVL_WHITEOUTS, d_inode(dentry));
 }
 
+static void ovl_drop_nlink(struct dentry *dentry)
+{
+	struct inode *inode = d_inode(dentry);
+	struct dentry *alias;
+
+	/* Try to find another, hashed alias */
+	spin_lock(&inode->i_lock);
+	hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
+		if (alias != dentry && !d_unhashed(alias))
+			break;
+	}
+	spin_unlock(&inode->i_lock);
+
+	/*
+	 * Changes to underlying layers may cause i_nlink to lose sync with
+	 * reality. In this case prevent the link count from going to zero
+	 * prematurely.
+	 */
+	if (inode->i_nlink > !!alias)
+		drop_nlink(inode);
+}
+
 static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 {
 	int err;
@@ -856,7 +881,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 		if (is_dir)
 			clear_nlink(dentry->d_inode);
 		else
-			drop_nlink(dentry->d_inode);
+			ovl_drop_nlink(dentry);
 	}
 	ovl_nlink_end(dentry);
 
@@ -1201,7 +1226,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 		if (new_is_dir)
 			clear_nlink(d_inode(new));
 		else
-			drop_nlink(d_inode(new));
+			ovl_drop_nlink(new);
 	}
 
 	ovl_dir_modified(old->d_parent, ovl_type_origin(old) ||
......
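The guard in ovl_drop_nlink() above, `inode->i_nlink > !!alias`, is compact enough to be easy to misread. A hypothetical pure-function model (`toy_drop_nlink` is not a kernel symbol) spells out the same comparison: the count is decremented only while it stays above a floor of 1 when another hashed alias exists, or 0 otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the guarded decrement: return the link count after attempting
 * one drop. `has_other_alias` plays the role of `!!alias`.
 */
static unsigned int toy_drop_nlink(unsigned int nlink, bool has_other_alias)
{
	unsigned int floor = has_other_alias ? 1 : 0;

	if (nlink > floor)
		nlink--;

	/* The count never falls below the floor it started above. */
	assert(nlink >= floor || nlink == 0);
	return nlink;
}
```

So a count of 1 stays at 1 while another hashed name still refers to the inode, and a count that is already 0 is never decremented.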
@@ -308,29 +308,35 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 		ovl_set_flag(OVL_UPPERDATA, inode);
 
 	dentry = d_find_any_alias(inode);
-	if (!dentry) {
-		dentry = d_alloc_anon(inode->i_sb);
-		if (!dentry)
-			goto nomem;
-		oe = ovl_alloc_entry(lower ? 1 : 0);
-		if (!oe)
-			goto nomem;
-
-		if (lower) {
-			oe->lowerstack->dentry = dget(lower);
-			oe->lowerstack->layer = lowerpath->layer;
-		}
-		dentry->d_fsdata = oe;
-		if (upper_alias)
-			ovl_dentry_set_upper_alias(dentry);
+	if (dentry)
+		goto out_iput;
+
+	dentry = d_alloc_anon(inode->i_sb);
+	if (unlikely(!dentry))
+		goto nomem;
+	oe = ovl_alloc_entry(lower ? 1 : 0);
+	if (!oe)
+		goto nomem;
+
+	if (lower) {
+		oe->lowerstack->dentry = dget(lower);
+		oe->lowerstack->layer = lowerpath->layer;
 	}
+	dentry->d_fsdata = oe;
+	if (upper_alias)
+		ovl_dentry_set_upper_alias(dentry);
+
+	ovl_dentry_update_reval(dentry, upper,
+			DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
 
 	return d_instantiate_anon(dentry, inode);
 
 nomem:
-	iput(inode);
 	dput(dentry);
-	return ERR_PTR(-ENOMEM);
+	dentry = ERR_PTR(-ENOMEM);
+out_iput:
+	iput(inode);
+	return dentry;
 }
 
 /* Get the upper or lower dentry in stach whose on layer @idx */
......
@@ -79,6 +79,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 {
 	bool samefs = ovl_same_fs(dentry->d_sb);
 	unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
+	unsigned int xinoshift = 64 - xinobits;
 
 	if (samefs) {
 		/*
@@ -89,22 +90,22 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
 		stat->dev = dentry->d_sb->s_dev;
 		return 0;
 	} else if (xinobits) {
-		unsigned int shift = 64 - xinobits;
 		/*
 		 * All inode numbers of underlying fs should not be using the
 		 * high xinobits, so we use high xinobits to partition the
 		 * overlay st_ino address space. The high bits holds the fsid
-		 * (upper fsid is 0). This way overlay inode numbers are unique
-		 * and all inodes use overlay st_dev. Inode numbers are also
-		 * persistent for a given layer configuration.
+		 * (upper fsid is 0). The lowest xinobit is reserved for mapping
+		 * the non-peresistent inode numbers range in case of overflow.
+		 * This way all overlay inode numbers are unique and use the
+		 * overlay st_dev.
 		 */
-		if (stat->ino >> shift) {
-			pr_warn_ratelimited("inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
-					    dentry, stat->ino, xinobits);
-		} else {
-			stat->ino |= ((u64)fsid) << shift;
+		if (likely(!(stat->ino >> xinoshift))) {
+			stat->ino |= ((u64)fsid) << (xinoshift + 1);
 			stat->dev = dentry->d_sb->s_dev;
 			return 0;
+		} else if (ovl_xino_warn(dentry->d_sb)) {
+			pr_warn_ratelimited("inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
+					    dentry, stat->ino, xinobits);
 		}
 	}
 
@@ -504,7 +505,7 @@ static const struct address_space_operations ovl_aops = {
 
 /*
  * It is possible to stack overlayfs instance on top of another
- * overlayfs instance as lower layer. We need to annonate the
+ * overlayfs instance as lower layer. We need to annotate the
  * stackable i_mutex locks according to stack level of the super
  * block instance. An overlayfs instance can never be in stack
  * depth 0 (there is always a real fs below it). An overlayfs
@@ -561,27 +562,73 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 #endif
 }
 
-static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
-			   unsigned long ino, int fsid)
+static void ovl_next_ino(struct inode *inode)
+{
+	struct ovl_fs *ofs = inode->i_sb->s_fs_info;
+
+	inode->i_ino = atomic_long_inc_return(&ofs->last_ino);
+	if (unlikely(!inode->i_ino))
+		inode->i_ino = atomic_long_inc_return(&ofs->last_ino);
+}
+
+static void ovl_map_ino(struct inode *inode, unsigned long ino, int fsid)
 {
 	int xinobits = ovl_xino_bits(inode->i_sb);
+	unsigned int xinoshift = 64 - xinobits;
 
 	/*
 	 * When d_ino is consistent with st_ino (samefs or i_ino has enough
 	 * bits to encode layer), set the same value used for st_ino to i_ino,
 	 * so inode number exposed via /proc/locks and a like will be
 	 * consistent with d_ino and st_ino values. An i_ino value inconsistent
-	 * with d_ino also causes nfsd readdirplus to fail. When called from
-	 * ovl_new_inode(), ino arg is 0, so i_ino will be updated to real
-	 * upper inode i_ino on ovl_inode_init() or ovl_inode_update().
+	 * with d_ino also causes nfsd readdirplus to fail.
 	 */
-	if (ovl_same_dev(inode->i_sb)) {
-		inode->i_ino = ino;
-		if (xinobits && fsid && !(ino >> (64 - xinobits)))
-			inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
-	} else {
-		inode->i_ino = get_next_ino();
+	inode->i_ino = ino;
+	if (ovl_same_fs(inode->i_sb)) {
+		return;
+	} else if (xinobits && likely(!(ino >> xinoshift))) {
+		inode->i_ino |= (unsigned long)fsid << (xinoshift + 1);
+		return;
+	}
+
+	/*
+	 * For directory inodes on non-samefs with xino disabled or xino
+	 * overflow, we allocate a non-persistent inode number, to be used for
+	 * resolving st_ino collisions in ovl_map_dev_ino().
+	 *
+	 * To avoid ino collision with legitimate xino values from upper
+	 * layer (fsid 0), use the lowest xinobit to map the non
+	 * persistent inode numbers to the unified st_ino address space.
+	 */
+	if (S_ISDIR(inode->i_mode)) {
+		ovl_next_ino(inode);
+		if (xinobits) {
+			inode->i_ino &= ~0UL >> xinobits;
+			inode->i_ino |= 1UL << xinoshift;
+		}
 	}
+}
+
+void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
+		    unsigned long ino, int fsid)
+{
+	struct inode *realinode;
+
+	if (oip->upperdentry)
+		OVL_I(inode)->__upperdentry = oip->upperdentry;
+	if (oip->lowerpath && oip->lowerpath->dentry)
+		OVL_I(inode)->lower = igrab(d_inode(oip->lowerpath->dentry));
+	if (oip->lowerdata)
+		OVL_I(inode)->lowerdata = igrab(d_inode(oip->lowerdata));
+
+	realinode = ovl_inode_real(inode);
+	ovl_copyattr(realinode, inode);
+	ovl_copyflags(realinode, inode);
+	ovl_map_ino(inode, ino, fsid);
+}
+
+static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev)
+{
 	inode->i_mode = mode;
 	inode->i_flags |= S_NOCMTIME;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -719,7 +766,7 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 
 	inode = new_inode(sb);
 	if (inode)
-		ovl_fill_inode(inode, mode, rdev, 0, 0);
+		ovl_fill_inode(inode, mode, rdev);
 
 	return inode;
 }
@@ -891,7 +938,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
 	struct dentry *lowerdentry = lowerpath ? lowerpath->dentry : NULL;
 	bool bylower = ovl_hash_bylower(sb, upperdentry, lowerdentry,
 					oip->index);
-	int fsid = bylower ? oip->lowerpath->layer->fsid : 0;
+	int fsid = bylower ? lowerpath->layer->fsid : 0;
 	bool is_dir, metacopy = false;
 	unsigned long ino = 0;
 	int err = oip->newinode ? -EEXIST : -ENOMEM;
@@ -941,9 +988,11 @@ struct inode *ovl_get_inode(struct super_block *sb,
 			err = -ENOMEM;
 			goto out_err;
 		}
+		ino = realinode->i_ino;
+		fsid = lowerpath->layer->fsid;
 	}
-	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev, ino, fsid);
-	ovl_inode_init(inode, upperdentry, lowerdentry, oip->lowerdata);
+	ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
+	ovl_inode_init(inode, oip, ino, fsid);
 
 	if (upperdentry && ovl_is_impuredir(upperdentry))
 		ovl_set_flag(OVL_IMPURE, inode);
......
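The inode-number scheme introduced by ovl_next_ino() and ovl_map_ino() above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions rather than the kernel code: all `toy_*` names are hypothetical, C11 atomics stand in for atomic_long_inc_return(), inode numbers are taken to be 64 bits wide, and `xinobits` is assumed to be at least 2 so that the `xinoshift + 1` shift stays below 64.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-mount ofs->last_ino counter. */
static atomic_uint_fast64_t toy_last_ino;

/* Hand out the next non-persistent number, skipping 0 on wrap-around. */
static uint64_t toy_next_ino(void)
{
	uint64_t ino = atomic_fetch_add(&toy_last_ino, 1) + 1;

	if (ino == 0)
		ino = atomic_fetch_add(&toy_last_ino, 1) + 1;
	return ino;
}

/* Persistent case: the fsid is stored above the reserved lowest xinobit. */
static uint64_t toy_map_persistent(uint64_t ino, unsigned int fsid,
				   unsigned int xinobits)
{
	unsigned int xinoshift = 64 - xinobits;

	assert(xinobits >= 2 && xinobits <= 32);
	assert((ino >> xinoshift) == 0);	/* caller ruled out overflow */
	return ino | ((uint64_t)fsid << (xinoshift + 1));
}

/* Non-persistent case: clear the high bits, then set the reserved bit. */
static uint64_t toy_map_nonpersistent(uint64_t ino, unsigned int xinobits)
{
	unsigned int xinoshift = 64 - xinobits;

	assert(xinobits >= 2 && xinobits <= 32);
	ino &= ~UINT64_C(0) >> xinobits;
	ino |= UINT64_C(1) << xinoshift;
	return ino;
}
```

With 8 xino bits, an upper-layer inode 5 (fsid 0) keeps the value 5, whereas a non-persistent number 5 lands at bit 56 plus 5, so the two ranges cannot collide. That separation is the purpose of the reserved bit described in the comments above.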
@@ -845,7 +845,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		if (err)
 			goto out;
 
-		if (upperdentry && unlikely(ovl_dentry_remote(upperdentry))) {
+		if (upperdentry && upperdentry->d_flags & DCACHE_OP_REAL) {
 			dput(upperdentry);
 			err = -EREMOTE;
 			goto out;
@@ -1076,6 +1076,9 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			goto out_free_oe;
 	}
 
+	ovl_dentry_update_reval(dentry, upperdentry,
+			DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
+
 	revert_creds(old_cred);
 	if (origin_path) {
 		dput(origin_path->dentry);
......
@@ -48,6 +48,12 @@ enum ovl_entry_flag {
 	OVL_E_CONNECTED,
 };
 
+enum {
+	OVL_XINO_OFF,
+	OVL_XINO_AUTO,
+	OVL_XINO_ON,
+};
+
 /*
  * The tuple (fh,uuid) is a universal unique identifier for a copy up origin,
  * where:
@@ -87,7 +93,7 @@ struct ovl_fb {
 	u8 flags;	/* OVL_FH_FLAG_* */
 	u8 type;	/* fid_type of fid */
 	uuid_t uuid;	/* uuid of filesystem */
-	u32 fid[0];	/* file identifier should be 32bit aligned in-memory */
+	u32 fid[];	/* file identifier should be 32bit aligned in-memory */
 } __packed;
 
 /* In-memory and on-wire format for overlay file handle */
@@ -230,6 +236,8 @@ bool ovl_index_all(struct super_block *sb);
 bool ovl_verify_lower(struct super_block *sb);
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
 bool ovl_dentry_remote(struct dentry *dentry);
+void ovl_dentry_update_reval(struct dentry *dentry, struct dentry *upperdentry,
+			     unsigned int mask);
 bool ovl_dentry_weird(struct dentry *dentry);
 enum ovl_path_type ovl_path_type(struct dentry *dentry);
 void ovl_path_upper(struct dentry *dentry, struct path *path);
@@ -264,8 +272,6 @@ void ovl_set_upperdata(struct inode *inode);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
-void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry, struct dentry *lowerdata);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
 void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
@@ -301,6 +307,16 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 	return ovl_check_dir_xattr(dentry, OVL_XATTR_IMPURE);
 }
 
+/*
+ * With xino=auto, we do best effort to keep all inodes on same st_dev and
+ * d_ino consistent with st_ino.
+ * With xino=on, we do the same effort but we warn if we failed.
+ */
+static inline bool ovl_xino_warn(struct super_block *sb)
+{
+	return OVL_FS(sb)->config.xino == OVL_XINO_ON;
+}
+
 /* All layers on same fs? */
 static inline bool ovl_same_fs(struct super_block *sb)
 {
@@ -410,6 +426,8 @@ struct ovl_inode_params {
 	char *redirect;
 	struct dentry *lowerdata;
 };
+void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
+		    unsigned long ino, int fsid);
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
 struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real,
 			       bool is_upper);
@@ -451,6 +469,7 @@ struct ovl_cattr {
 struct dentry *ovl_create_real(struct inode *dir, struct dentry *newdentry,
 			       struct ovl_cattr *attr);
 int ovl_cleanup(struct inode *dir, struct dentry *dentry);
+struct dentry *ovl_lookup_temp(struct dentry *workdir);
 struct dentry *ovl_create_temp(struct dentry *workdir, struct ovl_cattr *attr);
 
 /* file.c */
......
@@ -75,6 +75,8 @@ struct ovl_fs {
 	struct inode *indexdir_trap;
 	/* -1: disabled, 0: same fs, 1..32: number of unused ino bits */
 	int xino_mode;
+	/* For allocation of non-persistent inode numbers */
+	atomic_long_t last_ino;
 };
 
 static inline struct ovl_fs *OVL_FS(struct super_block *sb)
@@ -438,15 +438,23 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
 
 /* Map inode number to lower fs unique range */
 static u64 ovl_remap_lower_ino(u64 ino, int xinobits, int fsid,
-			       const char *name, int namelen)
+			       const char *name, int namelen, bool warn)
 {
-	if (ino >> (64 - xinobits)) {
-		pr_warn_ratelimited("d_ino too big (%.*s, ino=%llu, xinobits=%d)\n",
-				    namelen, name, ino, xinobits);
+	unsigned int xinoshift = 64 - xinobits;
+
+	if (unlikely(ino >> xinoshift)) {
+		if (warn) {
+			pr_warn_ratelimited("d_ino too big (%.*s, ino=%llu, xinobits=%d)\n",
+					    namelen, name, ino, xinobits);
+		}
 		return ino;
 	}
 
-	return ino | ((u64)fsid) << (64 - xinobits);
+	/*
+	 * The lowest xinobit is reserved for mapping the non-peresistent inode
+	 * numbers range, but this range is only exposed via st_ino, not here.
+	 */
+	return ino | ((u64)fsid) << (xinoshift + 1);
 }
 
 /*
@@ -515,7 +523,8 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
 	} else if (xinobits && !OVL_TYPE_UPPER(type)) {
 		ino = ovl_remap_lower_ino(ino, xinobits,
 					  ovl_layer_lower(this)->fsid,
-					  p->name, p->len);
+					  p->name, p->len,
+					  ovl_xino_warn(dir->d_sb));
 	}
 
 out:
@@ -645,6 +654,7 @@ struct ovl_readdir_translate {
 	u64 parent_ino;
 	int fsid;
 	int xinobits;
+	bool xinowarn;
 };
 
 static int ovl_fill_real(struct dir_context *ctx, const char *name,
@@ -665,7 +675,7 @@ static int ovl_fill_real(struct dir_context *ctx, const char *name,
 			ino = p->ino;
 	} else if (rdt->xinobits) {
 		ino = ovl_remap_lower_ino(ino, rdt->xinobits, rdt->fsid,
-					  name, namelen);
+					  name, namelen, rdt->xinowarn);
 	}
 
 	return orig_ctx->actor(orig_ctx, name, namelen, offset, ino, d_type);
@@ -696,6 +706,7 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 		.ctx.actor = ovl_fill_real,
 		.orig_ctx = ctx,
 		.xinobits = ovl_xino_bits(dir->d_sb),
+		.xinowarn = ovl_xino_warn(dir->d_sb),
 	};
 
 	if (rdt.xinobits && lower_layer)
......
This diff is collapsed.
@@ -93,8 +93,24 @@ struct ovl_entry *ovl_alloc_entry(unsigned int numlower)
 bool ovl_dentry_remote(struct dentry *dentry)
 {
 	return dentry->d_flags &
-		(DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE |
-		 DCACHE_OP_REAL);
+		(DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
+}
+
+void ovl_dentry_update_reval(struct dentry *dentry, struct dentry *upperdentry,
+			     unsigned int mask)
+{
+	struct ovl_entry *oe = OVL_E(dentry);
+	unsigned int i, flags = 0;
+
+	if (upperdentry)
+		flags |= upperdentry->d_flags;
+	for (i = 0; i < oe->numlower; i++)
+		flags |= oe->lowerstack[i].dentry->d_flags;
+
+	spin_lock(&dentry->d_lock);
+	dentry->d_flags &= ~mask;
+	dentry->d_flags |= flags & mask;
+	spin_unlock(&dentry->d_lock);
 }
 
 bool ovl_dentry_weird(struct dentry *dentry)
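The flag propagation in ovl_dentry_update_reval() above is a plain OR-then-mask operation. The following userspace sketch shows the same arithmetic without locking or kernel types. The function `toy_update_reval` and the `TOY_*` bit values are hypothetical and do not correspond to the real DCACHE_* constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration only. */
#define TOY_OP_REVALIDATE	0x1u
#define TOY_OP_WEAK_REVALIDATE	0x2u

/*
 * Return the overlay dentry's new flags: the bits selected by `mask` are
 * replaced by the union of the same bits over all real (upper and lower)
 * dentries; every bit outside `mask` is left untouched.
 */
static unsigned int toy_update_reval(unsigned int own_flags,
				     const unsigned int *real_flags, size_t n,
				     unsigned int mask)
{
	unsigned int flags = 0;
	size_t i;

	assert(n == 0 || real_flags != NULL);
	for (i = 0; i < n; i++)
		flags |= real_flags[i];

	own_flags &= ~mask;
	own_flags |= flags & mask;
	return own_flags;
}
```

An overlay dentry therefore ends up needing revalidation only if at least one of its real dentries does, which matches the "per-dentry basis" wording in the commit list.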
@@ -386,24 +402,6 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
 	oi->redirect = redirect;
 }
 
-void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
-		    struct dentry *lowerdentry, struct dentry *lowerdata)
-{
-	struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
-
-	if (upperdentry)
-		OVL_I(inode)->__upperdentry = upperdentry;
-	if (lowerdentry)
-		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
-	if (lowerdata)
-		OVL_I(inode)->lowerdata = igrab(d_inode(lowerdata));
-
-	ovl_copyattr(realinode, inode);
-	ovl_copyflags(realinode, inode);
-	if (!inode->i_ino)
-		inode->i_ino = realinode->i_ino;
-}
-
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 {
 	struct inode *upperinode = d_inode(upperdentry);
@@ -416,8 +414,6 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 	smp_wmb();
 	OVL_I(inode)->__upperdentry = upperdentry;
 	if (inode_unhashed(inode)) {
-		if (!inode->i_ino)
-			inode->i_ino = upperinode->i_ino;
 		inode->i_private = upperinode;
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 	}
......