eventfs: Remove eventfs_file and just use eventfs_inode

Instead of having a descriptor for every file represented in the eventfs
directory, only have the directory itself represented. Change the API to
send in a list of entries that represent all the files in the directory
(but not other directories). The entry list contains a name and a callback
function that will be used to create the files when they are accessed.

struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry *parent,
						const struct eventfs_entry *entries,
						int size, void *data);

is used for the top level eventfs directory, and returns an eventfs_inode
that will be used by:

struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode *parent,
					 const struct eventfs_entry *entries,
					 int size, void *data);

where both of the above take an array of struct eventfs_entry entries for
every file that is in the directory.

The entries are defined by:

typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
				const struct file_operations **fops);

struct eventfs_entry {
	const char			*name;
	eventfs_callback		callback;
};

Where the name is the name of the file and the callback gets called when
the file is being created. The callback passes in the name (in case the
same callback is used for multiple files), a pointer to the mode, data and
fops. The data will be pointing to the data that was passed in
eventfs_create_dir() or eventfs_create_events_dir() but may be overridden
to point to something else, as it will be used to point to the
inode->i_private that is created. The information passed back from the
callback is used to create the dentry/inode.

If the callback fills the data and the file should be created, it must
return a positive number. On zero or negative, the file is ignored.

This logic may also be used as a prototype to convert entire pseudo file
systems into just-in-time allocation.

The "show_events_dentry" file has been updated to show the directories,
and any files they have.

With just the eventfs_file allocations:

 Before after deltas for meminfo (in kB):

   MemFree:		-14360
   MemAvailable:	-14260
   Buffers:		40
   Cached:		24
   Active:		44
   Inactive:		48
   Inactive(anon):	28
   Active(file):	44
   Inactive(file):	20
   Dirty:		-4
   AnonPages:		28
   Mapped:		4
   KReclaimable:	132
   Slab:		1604
   SReclaimable:	132
   SUnreclaim:		1472
   Committed_AS:	12

 Before after deltas for slabinfo:

   <slab>:		<objects>	[ * <size> = <total>]

   ext4_inode_cache	27		[* 1184 = 31968 ]
   extent_status	102		[*   40 = 4080 ]
   tracefs_inode_cache	144		[*  656 = 94464 ]
   buffer_head		39		[*  104 = 4056 ]
   shmem_inode_cache	49		[*  800 = 39200 ]
   filp			-53		[*  256 = -13568 ]
   dentry		251		[*  192 = 48192 ]
   lsm_file_cache	277		[*   32 = 8864 ]
   vm_area_struct	-14		[*  184 = -2576 ]
   trace_event_file	1748		[*   88 = 153824 ]
   kmalloc-1k		35		[* 1024 = 35840 ]
   kmalloc-256		49		[*  256 = 12544 ]
   kmalloc-192		-28		[*  192 = -5376 ]
   kmalloc-128		-30		[*  128 = -3840 ]
   kmalloc-96		10581		[*   96 = 1015776 ]
   kmalloc-64		3056		[*   64 = 195584 ]
   kmalloc-32		1291		[*   32 = 41312 ]
   kmalloc-16		2310		[*   16 = 36960 ]
   kmalloc-8		9216		[*    8 = 73728 ]

 Free memory dropped by 14,360 kB
 Available memory dropped by 14,260 kB
 Total slab additions in size: 1,771,032 bytes

With this change:

 Before after deltas for meminfo (in kB):

   MemFree:		-12084
   MemAvailable:	-11976
   Buffers:		32
   Cached:		32
   Active:		72
   Inactive:		168
   Inactive(anon):	176
   Active(file):	72
   Inactive(file):	-8
   Dirty:		24
   AnonPages:		196
   Mapped:		8
   KReclaimable:	148
   Slab:		836
   SReclaimable:	148
   SUnreclaim:		688
   Committed_AS:	324

 Before after deltas for slabinfo:

   <slab>:		<objects>	[ * <size> = <total>]

   tracefs_inode_cache	144		[* 656 = 94464 ]
   shmem_inode_cache	-23		[* 800 = -18400 ]
   filp			-92		[* 256 = -23552 ]
   dentry		179		[* 192 = 34368 ]
   lsm_file_cache	-3		[* 32 = -96 ]
   vm_area_struct	-13		[* 184 = -2392 ]
   trace_event_file	1748		[* 88 = 153824 ]
   kmalloc-1k		-49		[* 1024 = -50176 ]
   kmalloc-256		-27		[* 256 = -6912 ]
   kmalloc-128		1864		[* 128 = 238592 ]
   kmalloc-64		4685		[* 64 = 299840 ]
   kmalloc-32		-72		[* 32 = -2304 ]
   kmalloc-16		256		[* 16 = 4096 ]
   total = 721352

 Free memory dropped by 12,084 kB
 Available memory dropped by 11,976 kB
 Total slab additions in size:  721,352 bytes

That's over 2 MB in savings per instance for free and available memory,
and over 1 MB in savings per instance of slab memory.

Link: https://lore.kernel.org/linux-trace-kernel/20231003184059.4924468e@gandalf.local.home
Link: https://lore.kernel.org/linux-trace-kernel/20231004165007.43d79161@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Signed-off-by: default avatarSteven Rostedt (Google) <rostedt@goodmis.org>
parent 2c6d0950
...@@ -2,8 +2,9 @@ ...@@ -2,8 +2,9 @@
/* /*
* event_inode.c - part of tracefs, a pseudo file system for activating tracing * event_inode.c - part of tracefs, a pseudo file system for activating tracing
* *
* Copyright (C) 2020-23 VMware Inc, author: Steven Rostedt (VMware) <rostedt@goodmis.org> * Copyright (C) 2020-23 VMware Inc, author: Steven Rostedt <rostedt@goodmis.org>
* Copyright (C) 2020-23 VMware Inc, author: Ajay Kaher <akaher@vmware.com> * Copyright (C) 2020-23 VMware Inc, author: Ajay Kaher <akaher@vmware.com>
* Copyright (C) 2023 Google, author: Steven Rostedt <rostedt@goodmis.org>
* *
* eventfs is used to dynamically create inodes and dentries based on the * eventfs is used to dynamically create inodes and dentries based on the
* meta data provided by the tracing system. * meta data provided by the tracing system.
...@@ -23,46 +24,6 @@ ...@@ -23,46 +24,6 @@
#include <linux/delay.h> #include <linux/delay.h>
#include "internal.h" #include "internal.h"
struct eventfs_inode {
struct list_head e_top_files;
};
/*
* struct eventfs_file - hold the properties of the eventfs files and
* directories.
* @name: the name of the file or directory to create
* @d_parent: holds parent's dentry
* @dentry: once accessed holds dentry
* @list: file or directory to be added to parent directory
* @ei: list of files and directories within directory
* @fop: file_operations for file or directory
* @iop: inode_operations for file or directory
* @data: something that the caller will want to get to later on
* @mode: the permission that the file or directory should have
*/
struct eventfs_file {
const char *name;
struct dentry *d_parent;
struct dentry *dentry;
struct list_head list;
struct eventfs_inode *ei;
const struct file_operations *fop;
const struct inode_operations *iop;
/*
* Union - used for deletion
* @del_list: list of eventfs_file to delete
* @rcu: eventfs_file to delete in RCU
* @is_freed: node is freed if one of the above is set
*/
union {
struct list_head del_list;
struct rcu_head rcu;
unsigned long is_freed;
};
void *data;
umode_t mode;
};
static DEFINE_MUTEX(eventfs_mutex); static DEFINE_MUTEX(eventfs_mutex);
DEFINE_STATIC_SRCU(eventfs_srcu); DEFINE_STATIC_SRCU(eventfs_srcu);
...@@ -93,16 +54,9 @@ static const struct file_operations eventfs_file_operations = { ...@@ -93,16 +54,9 @@ static const struct file_operations eventfs_file_operations = {
* @data: something that the caller will want to get to later on. * @data: something that the caller will want to get to later on.
* @fop: struct file_operations that should be used for this file. * @fop: struct file_operations that should be used for this file.
* *
* This is the basic "create a file" function for tracefs. It allows for a * This function creates a dentry that represents a file in the eventsfs_inode
* wide range of flexibility in creating a file. * directory. The inode.i_private pointer will point to @data in the open()
* * call.
* This function will return a pointer to a dentry if it succeeds. This
* pointer must be passed to the tracefs_remove() function when the file is
* to be removed (no automatic cleanup happens if your module is unloaded,
* you are responsible here.) If an error occurs, %NULL will be returned.
*
* If tracefs is not enabled in the kernel, the value -%ENODEV will be
* returned.
*/ */
static struct dentry *create_file(const char *name, umode_t mode, static struct dentry *create_file(const char *name, umode_t mode,
struct dentry *parent, void *data, struct dentry *parent, void *data,
...@@ -118,6 +72,7 @@ static struct dentry *create_file(const char *name, umode_t mode, ...@@ -118,6 +72,7 @@ static struct dentry *create_file(const char *name, umode_t mode,
if (WARN_ON_ONCE(!S_ISREG(mode))) if (WARN_ON_ONCE(!S_ISREG(mode)))
return NULL; return NULL;
WARN_ON_ONCE(!parent);
dentry = eventfs_start_creating(name, parent); dentry = eventfs_start_creating(name, parent);
if (IS_ERR(dentry)) if (IS_ERR(dentry))
...@@ -142,20 +97,11 @@ static struct dentry *create_file(const char *name, umode_t mode, ...@@ -142,20 +97,11 @@ static struct dentry *create_file(const char *name, umode_t mode,
* create_dir - create a dir in the tracefs filesystem * create_dir - create a dir in the tracefs filesystem
* @name: the name of the file to create. * @name: the name of the file to create.
* @parent: parent dentry for this file. * @parent: parent dentry for this file.
* @data: something that the caller will want to get to later on.
*
* This is the basic "create a dir" function for eventfs. It allows for a
* wide range of flexibility in creating a dir.
*
* This function will return a pointer to a dentry if it succeeds. This
* pointer must be passed to the tracefs_remove() function when the file is
* to be removed (no automatic cleanup happens if your module is unloaded,
* you are responsible here.) If an error occurs, %NULL will be returned.
* *
* If tracefs is not enabled in the kernel, the value -%ENODEV will be * This function will create a dentry for a directory represented by
* returned. * a eventfs_inode.
*/ */
static struct dentry *create_dir(const char *name, struct dentry *parent, void *data) static struct dentry *create_dir(const char *name, struct dentry *parent)
{ {
struct tracefs_inode *ti; struct tracefs_inode *ti;
struct dentry *dentry; struct dentry *dentry;
...@@ -172,7 +118,6 @@ static struct dentry *create_dir(const char *name, struct dentry *parent, void * ...@@ -172,7 +118,6 @@ static struct dentry *create_dir(const char *name, struct dentry *parent, void *
inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO; inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
inode->i_op = &eventfs_root_dir_inode_operations; inode->i_op = &eventfs_root_dir_inode_operations;
inode->i_fop = &eventfs_file_operations; inode->i_fop = &eventfs_file_operations;
inode->i_private = data;
ti = get_tracefs(inode); ti = get_tracefs(inode);
ti->flags |= TRACEFS_EVENT_INODE; ti->flags |= TRACEFS_EVENT_INODE;
...@@ -185,18 +130,18 @@ static struct dentry *create_dir(const char *name, struct dentry *parent, void * ...@@ -185,18 +130,18 @@ static struct dentry *create_dir(const char *name, struct dentry *parent, void *
} }
/** /**
* eventfs_set_ef_status_free - set the ef->status to free * eventfs_set_ei_status_free - remove the dentry reference from an eventfs_inode
* @ti: the tracefs_inode of the dentry * @ti: the tracefs_inode of the dentry
* @dentry: dentry who's status to be freed * @dentry: dentry which has the reference to remove.
* *
* eventfs_set_ef_status_free will be called if no more * Remove the association between a dentry from an eventfs_inode.
* references remain
*/ */
void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry) void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
{ {
struct tracefs_inode *ti_parent; struct tracefs_inode *ti_parent;
struct eventfs_inode *ei_child, *tmp;
struct eventfs_inode *ei; struct eventfs_inode *ei;
struct eventfs_file *ef, *tmp; int i;
/* The top level events directory may be freed by this */ /* The top level events directory may be freed by this */
if (unlikely(ti->flags & TRACEFS_EVENT_TOP_INODE)) { if (unlikely(ti->flags & TRACEFS_EVENT_TOP_INODE)) {
...@@ -207,9 +152,9 @@ void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry) ...@@ -207,9 +152,9 @@ void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry)
ei = ti->private; ei = ti->private;
/* Record all the top level files */ /* Record all the top level files */
list_for_each_entry_srcu(ef, &ei->e_top_files, list, list_for_each_entry_srcu(ei_child, &ei->children, list,
lockdep_is_held(&eventfs_mutex)) { lockdep_is_held(&eventfs_mutex)) {
list_add_tail(&ef->del_list, &ef_del_list); list_add_tail(&ei_child->del_list, &ef_del_list);
} }
/* Nothing should access this, but just in case! */ /* Nothing should access this, but just in case! */
...@@ -218,11 +163,13 @@ void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry) ...@@ -218,11 +163,13 @@ void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry)
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
/* Now safely free the top level files and their children */ /* Now safely free the top level files and their children */
list_for_each_entry_safe(ef, tmp, &ef_del_list, del_list) { list_for_each_entry_safe(ei_child, tmp, &ef_del_list, del_list) {
list_del(&ef->del_list); list_del(&ei_child->del_list);
eventfs_remove(ef); eventfs_remove_dir(ei_child);
} }
kfree_const(ei->name);
kfree(ei->d_children);
kfree(ei); kfree(ei);
return; return;
} }
...@@ -233,68 +180,162 @@ void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry) ...@@ -233,68 +180,162 @@ void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry)
if (!ti_parent || !(ti_parent->flags & TRACEFS_EVENT_INODE)) if (!ti_parent || !(ti_parent->flags & TRACEFS_EVENT_INODE))
goto out; goto out;
ef = dentry->d_fsdata; ei = dentry->d_fsdata;
if (!ef) if (!ei)
goto out; goto out;
/* /*
* If ef was freed, then the LSB bit is set for d_fsdata. * If ei was freed, then the LSB bit is set for d_fsdata.
* But this should not happen, as it should still have a * But this should not happen, as it should still have a
* ref count that prevents it. Warn in case it does. * ref count that prevents it. Warn in case it does.
*/ */
if (WARN_ON_ONCE((unsigned long)ef & 1)) if (WARN_ON_ONCE((unsigned long)ei & 1))
goto out; goto out;
/* This could belong to one of the files of the ei */
if (ei->dentry != dentry) {
for (i = 0; i < ei->nr_entries; i++) {
if (ei->d_children[i] == dentry)
break;
}
if (WARN_ON_ONCE(i == ei->nr_entries))
goto out;
ei->d_children[i] = NULL;
} else {
ei->dentry = NULL;
}
dentry->d_fsdata = NULL; dentry->d_fsdata = NULL;
ef->dentry = NULL; out:
out:
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
} }
/**
* create_file_dentry - create a dentry for a file of an eventfs_inode
* @ei: the eventfs_inode that the file will be created under
* @e_dentry: a pointer to the d_children[] of the @ei
* @parent: The parent dentry of the created file.
* @name: The name of the file to create
* @mode: The mode of the file.
* @data: The data to use to set the inode of the file with on open()
* @fops: The fops of the file to be created.
* @lookup: If called by the lookup routine, in which case, dput() the created dentry.
*
* Create a dentry for a file of an eventfs_inode @ei and place it into the
* address located at @e_dentry. If the @e_dentry already has a dentry, then
* just do a dget() on it and return. Otherwise create the dentry and attach it.
*/
static struct dentry *
create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
struct dentry *parent, const char *name, umode_t mode, void *data,
const struct file_operations *fops, bool lookup)
{
struct dentry *dentry;
bool invalidate = false;
mutex_lock(&eventfs_mutex);
/* If the e_dentry already has a dentry, use it */
if (*e_dentry) {
/* lookup does not need to up the ref count */
if (!lookup)
dget(*e_dentry);
mutex_unlock(&eventfs_mutex);
return *e_dentry;
}
mutex_unlock(&eventfs_mutex);
/* The lookup already has the parent->d_inode locked */
if (!lookup)
inode_lock(parent->d_inode);
dentry = create_file(name, mode, parent, data, fops);
if (!lookup)
inode_unlock(parent->d_inode);
mutex_lock(&eventfs_mutex);
if (IS_ERR_OR_NULL(dentry)) {
/*
* When the mutex was released, something else could have
* created the dentry for this e_dentry. In which case
* use that one.
*
* Note, with the mutex held, the e_dentry cannot have content
* and the ei->is_freed be true at the same time.
*/
WARN_ON_ONCE(ei->is_freed);
dentry = *e_dentry;
/* The lookup does not need to up the dentry refcount */
if (dentry && !lookup)
dget(dentry);
mutex_unlock(&eventfs_mutex);
return dentry;
}
if (!*e_dentry && !ei->is_freed) {
*e_dentry = dentry;
dentry->d_fsdata = ei;
} else {
/*
* Should never happen unless we get here due to being freed.
* Otherwise it means two dentries exist with the same name.
*/
WARN_ON_ONCE(!ei->is_freed);
invalidate = true;
}
mutex_unlock(&eventfs_mutex);
if (invalidate)
d_invalidate(dentry);
if (lookup || invalidate)
dput(dentry);
return invalidate ? NULL : dentry;
}
/** /**
* eventfs_post_create_dir - post create dir routine * eventfs_post_create_dir - post create dir routine
* @ef: eventfs_file of recently created dir * @ei: eventfs_inode of recently created dir
* *
* Map the meta-data of files within an eventfs dir to their parent dentry * Map the meta-data of files within an eventfs dir to their parent dentry
*/ */
static void eventfs_post_create_dir(struct eventfs_file *ef) static void eventfs_post_create_dir(struct eventfs_inode *ei)
{ {
struct eventfs_file *ef_child; struct eventfs_inode *ei_child;
struct tracefs_inode *ti; struct tracefs_inode *ti;
/* srcu lock already held */ /* srcu lock already held */
/* fill parent-child relation */ /* fill parent-child relation */
list_for_each_entry_srcu(ef_child, &ef->ei->e_top_files, list, list_for_each_entry_srcu(ei_child, &ei->children, list,
srcu_read_lock_held(&eventfs_srcu)) { srcu_read_lock_held(&eventfs_srcu)) {
ef_child->d_parent = ef->dentry; ei_child->d_parent = ei->dentry;
} }
ti = get_tracefs(ef->dentry->d_inode); ti = get_tracefs(ei->dentry->d_inode);
ti->private = ef->ei; ti->private = ei;
} }
/** /**
* create_dentry - helper function to create dentry * create_dir_dentry - Create a directory dentry for the eventfs_inode
* @ef: eventfs_file of file or directory to create * @ei: The eventfs_inode to create the directory for
* @parent: parent dentry * @parent: The dentry of the parent of this directory
* @lookup: true if called from lookup routine * @lookup: True if this is called by the lookup code
* *
* Used to create a dentry for file/dir, executes post dentry creation routine * This creates and attaches a directory dentry to the eventfs_inode @ei.
*/ */
static struct dentry * static struct dentry *
create_dentry(struct eventfs_file *ef, struct dentry *parent, bool lookup) create_dir_dentry(struct eventfs_inode *ei, struct dentry *parent, bool lookup)
{ {
bool invalidate = false; bool invalidate = false;
struct dentry *dentry; struct dentry *dentry = NULL;
mutex_lock(&eventfs_mutex); mutex_lock(&eventfs_mutex);
if (ef->is_freed) { if (ei->dentry) {
mutex_unlock(&eventfs_mutex); /* If the dentry already has a dentry, use it */
return NULL; dentry = ei->dentry;
} /* lookup does not need to up the ref count */
if (ef->dentry) {
dentry = ef->dentry;
/* On dir open, up the ref count */
if (!lookup) if (!lookup)
dget(dentry); dget(dentry);
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
...@@ -302,42 +343,44 @@ create_dentry(struct eventfs_file *ef, struct dentry *parent, bool lookup) ...@@ -302,42 +343,44 @@ create_dentry(struct eventfs_file *ef, struct dentry *parent, bool lookup)
} }
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
/* The lookup already has the parent->d_inode locked */
if (!lookup) if (!lookup)
inode_lock(parent->d_inode); inode_lock(parent->d_inode);
if (ef->ei) dentry = create_dir(ei->name, parent);
dentry = create_dir(ef->name, parent, ef->data);
else
dentry = create_file(ef->name, ef->mode, parent,
ef->data, ef->fop);
if (!lookup) if (!lookup)
inode_unlock(parent->d_inode); inode_unlock(parent->d_inode);
mutex_lock(&eventfs_mutex); mutex_lock(&eventfs_mutex);
if (IS_ERR_OR_NULL(dentry)) {
/* If the ef was already updated get it */ if (IS_ERR_OR_NULL(dentry) && !ei->is_freed) {
dentry = ef->dentry; /*
* When the mutex was released, something else could have
* created the dentry for this e_dentry. In which case
* use that one.
*
* Note, with the mutex held, the e_dentry cannot have content
* and the ei->is_freed be true at the same time.
*/
dentry = ei->dentry;
if (dentry && !lookup) if (dentry && !lookup)
dget(dentry); dget(dentry);
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
return dentry; return dentry;
} }
if (!ef->dentry && !ef->is_freed) { if (!ei->dentry && !ei->is_freed) {
ef->dentry = dentry; ei->dentry = dentry;
if (ef->ei) eventfs_post_create_dir(ei);
eventfs_post_create_dir(ef); dentry->d_fsdata = ei;
dentry->d_fsdata = ef;
} else { } else {
/* A race here, should try again (unless freed) */
invalidate = true;
/* /*
* Should never happen unless we get here due to being freed. * Should never happen unless we get here due to being freed.
* Otherwise it means two dentries exist with the same name. * Otherwise it means two dentries exist with the same name.
*/ */
WARN_ON_ONCE(!ef->is_freed); WARN_ON_ONCE(!ei->is_freed);
invalidate = true;
} }
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
if (invalidate) if (invalidate)
...@@ -349,50 +392,85 @@ create_dentry(struct eventfs_file *ef, struct dentry *parent, bool lookup) ...@@ -349,50 +392,85 @@ create_dentry(struct eventfs_file *ef, struct dentry *parent, bool lookup)
return invalidate ? NULL : dentry; return invalidate ? NULL : dentry;
} }
static bool match_event_file(struct eventfs_file *ef, const char *name)
{
bool ret;
mutex_lock(&eventfs_mutex);
ret = !ef->is_freed && strcmp(ef->name, name) == 0;
mutex_unlock(&eventfs_mutex);
return ret;
}
/** /**
* eventfs_root_lookup - lookup routine to create file/dir * eventfs_root_lookup - lookup routine to create file/dir
* @dir: in which a lookup is being done * @dir: in which a lookup is being done
* @dentry: file/dir dentry * @dentry: file/dir dentry
* @flags: to pass as flags parameter to simple lookup * @flags: Just passed to simple_lookup()
* *
* Used to create a dynamic file/dir within @dir. Use the eventfs_inode * Used to create dynamic file/dir with-in @dir, search with-in @ei
* list of meta data to find the information needed to create the file/dir. * list, if @dentry found go ahead and create the file/dir
*/ */
static struct dentry *eventfs_root_lookup(struct inode *dir, static struct dentry *eventfs_root_lookup(struct inode *dir,
struct dentry *dentry, struct dentry *dentry,
unsigned int flags) unsigned int flags)
{ {
const struct file_operations *fops;
const struct eventfs_entry *entry;
struct eventfs_inode *ei_child;
struct tracefs_inode *ti; struct tracefs_inode *ti;
struct eventfs_inode *ei; struct eventfs_inode *ei;
struct eventfs_file *ef; struct dentry *ei_dentry = NULL;
struct dentry *ret = NULL; struct dentry *ret = NULL;
const char *name = dentry->d_name.name;
bool created = false;
umode_t mode;
void *data;
int idx; int idx;
int i;
int r;
ti = get_tracefs(dir); ti = get_tracefs(dir);
if (!(ti->flags & TRACEFS_EVENT_INODE)) if (!(ti->flags & TRACEFS_EVENT_INODE))
return NULL; return NULL;
ei = ti->private; /* Grab srcu to prevent the ei from going away */
idx = srcu_read_lock(&eventfs_srcu); idx = srcu_read_lock(&eventfs_srcu);
list_for_each_entry_srcu(ef, &ei->e_top_files, list,
/*
* Grab the eventfs_mutex to consistent value from ti->private.
* This s
*/
mutex_lock(&eventfs_mutex);
ei = READ_ONCE(ti->private);
if (ei)
ei_dentry = READ_ONCE(ei->dentry);
mutex_unlock(&eventfs_mutex);
if (!ei || !ei_dentry)
goto out;
data = ei->data;
list_for_each_entry_srcu(ei_child, &ei->children, list,
srcu_read_lock_held(&eventfs_srcu)) { srcu_read_lock_held(&eventfs_srcu)) {
if (!match_event_file(ef, dentry->d_name.name)) if (strcmp(ei_child->name, name) != 0)
continue; continue;
ret = simple_lookup(dir, dentry, flags); ret = simple_lookup(dir, dentry, flags);
create_dentry(ef, ef->d_parent, true); create_dir_dentry(ei_child, ei_dentry, true);
created = true;
break; break;
} }
if (created)
goto out;
for (i = 0; i < ei->nr_entries; i++) {
entry = &ei->entries[i];
if (strcmp(name, entry->name) == 0) {
void *cdata = data;
r = entry->callback(name, &mode, &cdata, &fops);
if (r <= 0)
continue;
ret = simple_lookup(dir, dentry, flags);
create_file_dentry(ei, &ei->d_children[i],
ei_dentry, name, mode, cdata,
fops, true);
break;
}
}
out:
srcu_read_unlock(&eventfs_srcu, idx); srcu_read_unlock(&eventfs_srcu, idx);
return ret; return ret;
} }
...@@ -432,29 +510,48 @@ static int eventfs_release(struct inode *inode, struct file *file) ...@@ -432,29 +510,48 @@ static int eventfs_release(struct inode *inode, struct file *file)
return dcache_dir_close(inode, file); return dcache_dir_close(inode, file);
} }
static int add_dentries(struct dentry ***dentries, struct dentry *d, int cnt)
{
struct dentry **tmp;
tmp = krealloc(*dentries, sizeof(d) * (cnt + 2), GFP_KERNEL);
if (!tmp)
return -1;
tmp[cnt] = d;
tmp[cnt + 1] = NULL;
*dentries = tmp;
return 0;
}
/** /**
* dcache_dir_open_wrapper - eventfs open wrapper * dcache_dir_open_wrapper - eventfs open wrapper
* @inode: not used * @inode: not used
* @file: dir to be opened (to create its child) * @file: dir to be opened (to create it's children)
* *
* Used to dynamically create the file/dir within @file. @file is really a * Used to dynamic create file/dir with-in @file, all the
* directory and all the files/dirs of the children within @file will be * file/dir will be created. If already created then references
* created. If any of the files/dirs have already been created, their * will be increased
* reference count will be incremented.
*/ */
static int dcache_dir_open_wrapper(struct inode *inode, struct file *file) static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
{ {
const struct file_operations *fops;
const struct eventfs_entry *entry;
struct eventfs_inode *ei_child;
struct tracefs_inode *ti; struct tracefs_inode *ti;
struct eventfs_inode *ei; struct eventfs_inode *ei;
struct eventfs_file *ef;
struct dentry_list *dlist; struct dentry_list *dlist;
struct dentry **dentries = NULL; struct dentry **dentries = NULL;
struct dentry *dentry = file_dentry(file); struct dentry *parent = file_dentry(file);
struct dentry *d; struct dentry *d;
struct inode *f_inode = file_inode(file); struct inode *f_inode = file_inode(file);
const char *name = parent->d_name.name;
umode_t mode;
void *data;
int cnt = 0; int cnt = 0;
int idx; int idx;
int ret; int ret;
int i;
int r;
ti = get_tracefs(f_inode); ti = get_tracefs(f_inode);
if (!(ti->flags & TRACEFS_EVENT_INODE)) if (!(ti->flags & TRACEFS_EVENT_INODE))
...@@ -463,25 +560,51 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file) ...@@ -463,25 +560,51 @@ static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
if (WARN_ON_ONCE(file->private_data)) if (WARN_ON_ONCE(file->private_data))
return -EINVAL; return -EINVAL;
idx = srcu_read_lock(&eventfs_srcu);
mutex_lock(&eventfs_mutex);
ei = READ_ONCE(ti->private);
mutex_unlock(&eventfs_mutex);
if (!ei) {
srcu_read_unlock(&eventfs_srcu, idx);
return -EINVAL;
}
data = ei->data;
dlist = kmalloc(sizeof(*dlist), GFP_KERNEL); dlist = kmalloc(sizeof(*dlist), GFP_KERNEL);
if (!dlist) if (!dlist) {
srcu_read_unlock(&eventfs_srcu, idx);
return -ENOMEM; return -ENOMEM;
}
ei = ti->private; list_for_each_entry_srcu(ei_child, &ei->children, list,
idx = srcu_read_lock(&eventfs_srcu);
list_for_each_entry_srcu(ef, &ei->e_top_files, list,
srcu_read_lock_held(&eventfs_srcu)) { srcu_read_lock_held(&eventfs_srcu)) {
d = create_dentry(ef, dentry, false); d = create_dir_dentry(ei_child, parent, false);
if (d) { if (d) {
struct dentry **tmp; ret = add_dentries(&dentries, d, cnt);
if (ret < 0)
break;
cnt++;
}
}
tmp = krealloc(dentries, sizeof(d) * (cnt + 2), GFP_KERNEL); for (i = 0; i < ei->nr_entries; i++) {
if (!tmp) void *cdata = data;
entry = &ei->entries[i];
name = entry->name;
r = entry->callback(name, &mode, &cdata, &fops);
if (r <= 0)
continue;
d = create_file_dentry(ei, &ei->d_children[i],
parent, name, mode, cdata, fops, false);
if (d) {
ret = add_dentries(&dentries, d, cnt);
if (ret < 0)
break; break;
tmp[cnt] = d;
tmp[cnt + 1] = NULL;
cnt++; cnt++;
dentries = tmp;
} }
} }
srcu_read_unlock(&eventfs_srcu, idx); srcu_read_unlock(&eventfs_srcu, idx);
...@@ -514,63 +637,90 @@ static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx) ...@@ -514,63 +637,90 @@ static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx)
} }
/** /**
* eventfs_prepare_ef - helper function to prepare eventfs_file * eventfs_create_dir - Create the eventfs_inode for this directory
* @name: the name of the file/directory to create. * @name: The name of the directory to create.
* @mode: the permission that the file should have. * @parent: The eventfs_inode of the parent directory.
* @fop: struct file_operations that should be used for this file/directory. * @entries: A list of entries that represent the files under this directory
* @iop: struct inode_operations that should be used for this file/directory. * @size: The number of @entries
* @data: something that the caller will want to get to later on. The * @data: The default data to pass to the files (an entry may override it).
* inode.i_private pointer will point to this value on the open() call. *
* This function creates the descriptor to represent a directory in the
* eventfs. This descriptor is an eventfs_inode, and it is returned to be
* used to create other children underneath.
*
* The @entries is an array of eventfs_entry structures which has:
* const char *name
* eventfs_callback callback;
*
* The name is the name of the file, and the callback is a pointer to a function
* that will be called when the file is reference (either by lookup or by
* reading a directory). The callback is of the prototype:
* *
* This function allocates and fills the eventfs_file structure. * int callback(const char *name, umode_t *mode, void **data,
* const struct file_operations **fops);
*
* When a file needs to be created, this callback will be called with
* name = the name of the file being created (so that the same callback
* may be used for multiple files).
* mode = a place to set the file's mode
* data = A pointer to @data, and the callback may replace it, which will
* cause the file created to pass the new data to the open() call.
* fops = the fops to use for the created file.
*/ */
static struct eventfs_file *eventfs_prepare_ef(const char *name, umode_t mode, struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode *parent,
const struct file_operations *fop, const struct eventfs_entry *entries,
const struct inode_operations *iop, int size, void *data)
void *data)
{ {
struct eventfs_file *ef; struct eventfs_inode *ei;
ef = kzalloc(sizeof(*ef), GFP_KERNEL); if (!parent)
if (!ef) return ERR_PTR(-EINVAL);
ei = kzalloc(sizeof(*ei), GFP_KERNEL);
if (!ei)
return ERR_PTR(-ENOMEM); return ERR_PTR(-ENOMEM);
ef->name = kstrdup(name, GFP_KERNEL); ei->name = kstrdup_const(name, GFP_KERNEL);
if (!ef->name) { if (!ei->name) {
kfree(ef); kfree(ei);
return ERR_PTR(-ENOMEM); return ERR_PTR(-ENOMEM);
} }
if (S_ISDIR(mode)) { if (size) {
ef->ei = kzalloc(sizeof(*ef->ei), GFP_KERNEL); ei->d_children = kzalloc(sizeof(*ei->d_children) * size, GFP_KERNEL);
if (!ef->ei) { if (!ei->d_children) {
kfree(ef->name); kfree_const(ei->name);
kfree(ef); kfree(ei);
return ERR_PTR(-ENOMEM); return ERR_PTR(-ENOMEM);
} }
INIT_LIST_HEAD(&ef->ei->e_top_files);
} else {
ef->ei = NULL;
} }
ef->iop = iop; ei->entries = entries;
ef->fop = fop; ei->nr_entries = size;
ef->mode = mode; ei->data = data;
ef->data = data; INIT_LIST_HEAD(&ei->children);
return ef;
mutex_lock(&eventfs_mutex);
list_add_tail(&ei->list, &parent->children);
ei->d_parent = parent->dentry;
mutex_unlock(&eventfs_mutex);
return ei;
} }
/** /**
* eventfs_create_events_dir - create the trace event structure * eventfs_create_events_dir - create the top level events directory
* @name: the name of the directory to create. * @name: The name of the top level directory to create.
* @parent: parent dentry for this file. This should be a directory dentry * @parent: Parent dentry for this file in the tracefs directory.
* if set. If this parameter is NULL, then the directory will be * @entries: A list of entries that represent the files under this directory
* created in the root of the tracefs filesystem. * @size: The number of @entries
* @data: The default data to pass to the files (an entry may override it).
* *
* This function creates the top of the trace event directory. * This function creates the top of the trace event directory.
*/ */
struct dentry *eventfs_create_events_dir(const char *name, struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry *parent,
struct dentry *parent) const struct eventfs_entry *entries,
int size, void *data)
{ {
struct dentry *dentry = tracefs_start_creating(name, parent); struct dentry *dentry = tracefs_start_creating(name, parent);
struct eventfs_inode *ei; struct eventfs_inode *ei;
...@@ -581,19 +731,32 @@ struct dentry *eventfs_create_events_dir(const char *name, ...@@ -581,19 +731,32 @@ struct dentry *eventfs_create_events_dir(const char *name,
return NULL; return NULL;
if (IS_ERR(dentry)) if (IS_ERR(dentry))
return dentry; return (struct eventfs_inode *)dentry;
ei = kzalloc(sizeof(*ei), GFP_KERNEL); ei = kzalloc(sizeof(*ei), GFP_KERNEL);
if (!ei) if (!ei)
return ERR_PTR(-ENOMEM); goto fail;
inode = tracefs_get_inode(dentry->d_sb); inode = tracefs_get_inode(dentry->d_sb);
if (unlikely(!inode)) { if (unlikely(!inode))
kfree(ei); goto fail;
tracefs_failed_creating(dentry);
return ERR_PTR(-ENOMEM); if (size) {
ei->d_children = kzalloc(sizeof(*ei->d_children) * size, GFP_KERNEL);
if (!ei->d_children)
goto fail;
} }
INIT_LIST_HEAD(&ei->e_top_files); ei->dentry = dentry;
ei->entries = entries;
ei->nr_entries = size;
ei->data = data;
ei->name = kstrdup_const(name, GFP_KERNEL);
if (!ei->name)
goto fail;
INIT_LIST_HEAD(&ei->children);
INIT_LIST_HEAD(&ei->list);
ti = get_tracefs(inode); ti = get_tracefs(inode);
ti->flags |= TRACEFS_EVENT_INODE | TRACEFS_EVENT_TOP_INODE; ti->flags |= TRACEFS_EVENT_INODE | TRACEFS_EVENT_TOP_INODE;
...@@ -608,193 +771,41 @@ struct dentry *eventfs_create_events_dir(const char *name, ...@@ -608,193 +771,41 @@ struct dentry *eventfs_create_events_dir(const char *name,
d_instantiate(dentry, inode); d_instantiate(dentry, inode);
inc_nlink(dentry->d_parent->d_inode); inc_nlink(dentry->d_parent->d_inode);
fsnotify_mkdir(dentry->d_parent->d_inode, dentry); fsnotify_mkdir(dentry->d_parent->d_inode, dentry);
return tracefs_end_creating(dentry); tracefs_end_creating(dentry);
}
/** /* Will call dput when the directory is removed */
* eventfs_add_subsystem_dir - add eventfs subsystem_dir to list to create later dget(dentry);
* @name: the name of the file to create.
* @parent: parent dentry for this dir.
*
* This function adds eventfs subsystem dir to list.
* And all these dirs are created on the fly when they are looked up,
* and the dentry and inodes will be removed when they are done.
*/
struct eventfs_file *eventfs_add_subsystem_dir(const char *name,
struct dentry *parent)
{
struct tracefs_inode *ti_parent;
struct eventfs_inode *ei_parent;
struct eventfs_file *ef;
if (security_locked_down(LOCKDOWN_TRACEFS)) return ei;
return NULL;
if (!parent)
return ERR_PTR(-EINVAL);
ti_parent = get_tracefs(parent->d_inode);
ei_parent = ti_parent->private;
ef = eventfs_prepare_ef(name, S_IFDIR, NULL, NULL, NULL); fail:
if (IS_ERR(ef)) kfree(ei->d_children);
return ef; kfree(ei);
tracefs_failed_creating(dentry);
mutex_lock(&eventfs_mutex); return ERR_PTR(-ENOMEM);
list_add_tail(&ef->list, &ei_parent->e_top_files);
ef->d_parent = parent;
mutex_unlock(&eventfs_mutex);
return ef;
} }
/** static void free_ei(struct rcu_head *head)
* eventfs_add_dir - add eventfs dir to list to create later
* @name: the name of the file to create.
* @ef_parent: parent eventfs_file for this dir.
*
* This function adds eventfs dir to list.
* And all these dirs are created on the fly when they are looked up,
* and the dentry and inodes will be removed when they are done.
*/
struct eventfs_file *eventfs_add_dir(const char *name,
struct eventfs_file *ef_parent)
{ {
struct eventfs_file *ef; struct eventfs_inode *ei = container_of(head, struct eventfs_inode, rcu);
if (security_locked_down(LOCKDOWN_TRACEFS))
return NULL;
if (!ef_parent) kfree_const(ei->name);
return ERR_PTR(-EINVAL); kfree(ei->d_children);
kfree(ei);
ef = eventfs_prepare_ef(name, S_IFDIR, NULL, NULL, NULL);
if (IS_ERR(ef))
return ef;
mutex_lock(&eventfs_mutex);
list_add_tail(&ef->list, &ef_parent->ei->e_top_files);
ef->d_parent = ef_parent->dentry;
mutex_unlock(&eventfs_mutex);
return ef;
}
/**
* eventfs_add_events_file - add the data needed to create a file for later reference
* @name: the name of the file to create.
* @mode: the permission that the file should have.
* @parent: parent dentry for this file.
* @data: something that the caller will want to get to later on.
* @fop: struct file_operations that should be used for this file.
*
* This function is used to add the information needed to create a
* dentry/inode within the top level events directory. The file created
* will have the @mode permissions. The @data will be used to fill the
* inode.i_private when the open() call is done. The dentry and inodes are
* all created when they are referenced, and removed when they are no
* longer referenced.
*/
int eventfs_add_events_file(const char *name, umode_t mode,
struct dentry *parent, void *data,
const struct file_operations *fop)
{
struct tracefs_inode *ti;
struct eventfs_inode *ei;
struct eventfs_file *ef;
if (security_locked_down(LOCKDOWN_TRACEFS))
return -ENODEV;
if (!parent)
return -EINVAL;
if (!(mode & S_IFMT))
mode |= S_IFREG;
if (!parent->d_inode)
return -EINVAL;
ti = get_tracefs(parent->d_inode);
if (!(ti->flags & TRACEFS_EVENT_INODE))
return -EINVAL;
ei = ti->private;
ef = eventfs_prepare_ef(name, mode, fop, NULL, data);
if (IS_ERR(ef))
return -ENOMEM;
mutex_lock(&eventfs_mutex);
list_add_tail(&ef->list, &ei->e_top_files);
ef->d_parent = parent;
mutex_unlock(&eventfs_mutex);
return 0;
}
/**
* eventfs_add_file - add eventfs file to list to create later
* @name: the name of the file to create.
* @mode: the permission that the file should have.
* @ef_parent: parent eventfs_file for this file.
* @data: something that the caller will want to get to later on.
* @fop: struct file_operations that should be used for this file.
*
* This function is used to add the information needed to create a
* file within a subdirectory of the events directory. The file created
* will have the @mode permissions. The @data will be used to fill the
* inode.i_private when the open() call is done. The dentry and inodes are
* all created when they are referenced, and removed when they are no
* longer referenced.
*/
int eventfs_add_file(const char *name, umode_t mode,
struct eventfs_file *ef_parent,
void *data,
const struct file_operations *fop)
{
struct eventfs_file *ef;
if (security_locked_down(LOCKDOWN_TRACEFS))
return -ENODEV;
if (!ef_parent)
return -EINVAL;
if (!(mode & S_IFMT))
mode |= S_IFREG;
ef = eventfs_prepare_ef(name, mode, fop, NULL, data);
if (IS_ERR(ef))
return -ENOMEM;
mutex_lock(&eventfs_mutex);
list_add_tail(&ef->list, &ef_parent->ei->e_top_files);
ef->d_parent = ef_parent->dentry;
mutex_unlock(&eventfs_mutex);
return 0;
}
static void free_ef(struct rcu_head *head)
{
struct eventfs_file *ef = container_of(head, struct eventfs_file, rcu);
kfree(ef->name);
kfree(ef->ei);
kfree(ef);
} }
/** /**
* eventfs_remove_rec - remove eventfs dir or file from list * eventfs_remove_rec - remove eventfs dir or file from list
* @ef: eventfs_file to be removed. * @ei: eventfs_inode to be removed.
* @head: to create list of eventfs_file to be deleted
* @level: to check recursion depth
* *
* The helper function eventfs_remove_rec() is used to clean up and free the * This function recursively remove eventfs_inode which
* associated data from eventfs for both of the added functions. * contains info of file or dir.
*/ */
static void eventfs_remove_rec(struct eventfs_file *ef, struct list_head *head, int level) static void eventfs_remove_rec(struct eventfs_inode *ei, struct list_head *head, int level)
{ {
struct eventfs_file *ef_child; struct eventfs_inode *ei_child;
if (!ef) if (!ei)
return; return;
/* /*
* Check recursion depth. It should never be greater than 3: * Check recursion depth. It should never be greater than 3:
...@@ -806,62 +817,68 @@ static void eventfs_remove_rec(struct eventfs_file *ef, struct list_head *head, ...@@ -806,62 +817,68 @@ static void eventfs_remove_rec(struct eventfs_file *ef, struct list_head *head,
if (WARN_ON_ONCE(level > 3)) if (WARN_ON_ONCE(level > 3))
return; return;
if (ef->ei) { /* search for nested folders or files */
/* search for nested folders or files */ list_for_each_entry_srcu(ei_child, &ei->children, list,
list_for_each_entry_srcu(ef_child, &ef->ei->e_top_files, list, lockdep_is_held(&eventfs_mutex)) {
lockdep_is_held(&eventfs_mutex)) { eventfs_remove_rec(ei_child, head, level + 1);
eventfs_remove_rec(ef_child, head, level + 1);
}
} }
list_del_rcu(&ef->list); list_del_rcu(&ei->list);
list_add_tail(&ef->del_list, head); list_add_tail(&ei->del_list, head);
} }
static void unhook_dentry(struct dentry **dentry, struct dentry **list)
{
if (*dentry) {
unsigned long ptr = (unsigned long)*list;
/* Keep the dentry from being freed yet */
dget(*dentry);
/*
* Paranoid: The dget() above should prevent the dentry
* from being freed and calling eventfs_set_ei_status_free().
* But just in case, set the link list LSB pointer to 1
* and have eventfs_set_ei_status_free() check that to
* make sure that if it does happen, it will not think
* the d_fsdata is an eventfs_inode.
*
* For this to work, no eventfs_inode should be allocated
* on a odd space, as the ef should always be allocated
* to be at least word aligned. Check for that too.
*/
WARN_ON_ONCE(ptr & 1);
(*dentry)->d_fsdata = (void *)(ptr | 1);
*list = *dentry;
*dentry = NULL;
}
}
/** /**
* eventfs_remove - remove eventfs dir or file from list * eventfs_remove - remove eventfs dir or file from list
* @ef: eventfs_file to be removed. * @ei: eventfs_inode to be removed.
* *
* This function acquire the eventfs_mutex lock and call eventfs_remove_rec() * This function acquire the eventfs_mutex lock and call eventfs_remove_rec()
*/ */
void eventfs_remove(struct eventfs_file *ef) void eventfs_remove_dir(struct eventfs_inode *ei)
{ {
struct eventfs_file *tmp; struct eventfs_inode *tmp;
LIST_HEAD(ef_del_list); LIST_HEAD(ei_del_list);
struct dentry *dentry_list = NULL; struct dentry *dentry_list = NULL;
struct dentry *dentry; struct dentry *dentry;
int i;
if (!ef) if (!ei)
return; return;
mutex_lock(&eventfs_mutex); mutex_lock(&eventfs_mutex);
eventfs_remove_rec(ef, &ef_del_list, 0); eventfs_remove_rec(ei, &ei_del_list, 0);
list_for_each_entry_safe(ef, tmp, &ef_del_list, del_list) {
if (ef->dentry) { list_for_each_entry_safe(ei, tmp, &ei_del_list, del_list) {
unsigned long ptr = (unsigned long)dentry_list; for (i = 0; i < ei->nr_entries; i++)
unhook_dentry(&ei->d_children[i], &dentry_list);
/* Keep the dentry from being freed yet */ unhook_dentry(&ei->dentry, &dentry_list);
dget(ef->dentry); call_srcu(&eventfs_srcu, &ei->rcu, free_ei);
/*
* Paranoid: The dget() above should prevent the dentry
* from being freed and calling eventfs_set_ef_status_free().
* But just in case, set the link list LSB pointer to 1
* and have eventfs_set_ef_status_free() check that to
* make sure that if it does happen, it will not think
* the d_fsdata is an event_file.
*
* For this to work, no event_file should be allocated
* on a odd space, as the ef should always be allocated
* to be at least word aligned. Check for that too.
*/
WARN_ON_ONCE(ptr & 1);
ef->dentry->d_fsdata = (void *)(ptr | 1);
dentry_list = ef->dentry;
ef->dentry = NULL;
}
call_srcu(&eventfs_srcu, &ef->rcu, free_ef);
} }
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
...@@ -876,8 +893,8 @@ void eventfs_remove(struct eventfs_file *ef) ...@@ -876,8 +893,8 @@ void eventfs_remove(struct eventfs_file *ef)
mutex_lock(&eventfs_mutex); mutex_lock(&eventfs_mutex);
/* dentry should now have at least a single reference */ /* dentry should now have at least a single reference */
WARN_ONCE((int)d_count(dentry) < 1, WARN_ONCE((int)d_count(dentry) < 1,
"dentry %p less than one reference (%d) after invalidate\n", "dentry %px (%s) less than one reference (%d) after invalidate\n",
dentry, d_count(dentry)); dentry, dentry->d_name.name, d_count(dentry));
mutex_unlock(&eventfs_mutex); mutex_unlock(&eventfs_mutex);
dput(dentry); dput(dentry);
} }
......
...@@ -385,7 +385,7 @@ static void tracefs_dentry_iput(struct dentry *dentry, struct inode *inode) ...@@ -385,7 +385,7 @@ static void tracefs_dentry_iput(struct dentry *dentry, struct inode *inode)
ti = get_tracefs(inode); ti = get_tracefs(inode);
if (ti && ti->flags & TRACEFS_EVENT_INODE) if (ti && ti->flags & TRACEFS_EVENT_INODE)
eventfs_set_ef_status_free(ti, dentry); eventfs_set_ei_status_free(ti, dentry);
iput(inode); iput(inode);
} }
......
...@@ -13,6 +13,41 @@ struct tracefs_inode { ...@@ -13,6 +13,41 @@ struct tracefs_inode {
struct inode vfs_inode; struct inode vfs_inode;
}; };
/*
* struct eventfs_inode - hold the properties of the eventfs directories.
* @list: link list into the parent directory
* @entries: the array of entries representing the files in the directory
* @name: the name of the directory to create
* @children: link list into the child eventfs_inode
* @dentry: the dentry of the directory
* @d_parent: pointer to the parent's dentry
* @d_children: The array of dentries to represent the files when created
* @data: The private data to pass to the callbacks
* @nr_entries: The number of items in @entries
*/
struct eventfs_inode {
struct list_head list;
const struct eventfs_entry *entries;
const char *name;
struct list_head children;
struct dentry *dentry;
struct dentry *d_parent;
struct dentry **d_children;
void *data;
/*
* Union - used for deletion
* @del_list: list of eventfs_inode to delete
* @rcu: eventfs_indoe to delete in RCU
* @is_freed: node is freed if one of the above is set
*/
union {
struct list_head del_list;
struct rcu_head rcu;
unsigned long is_freed;
};
int nr_entries;
};
static inline struct tracefs_inode *get_tracefs(const struct inode *inode) static inline struct tracefs_inode *get_tracefs(const struct inode *inode)
{ {
return container_of(inode, struct tracefs_inode, vfs_inode); return container_of(inode, struct tracefs_inode, vfs_inode);
...@@ -25,6 +60,6 @@ struct inode *tracefs_get_inode(struct super_block *sb); ...@@ -25,6 +60,6 @@ struct inode *tracefs_get_inode(struct super_block *sb);
struct dentry *eventfs_start_creating(const char *name, struct dentry *parent); struct dentry *eventfs_start_creating(const char *name, struct dentry *parent);
struct dentry *eventfs_failed_creating(struct dentry *dentry); struct dentry *eventfs_failed_creating(struct dentry *dentry);
struct dentry *eventfs_end_creating(struct dentry *dentry); struct dentry *eventfs_end_creating(struct dentry *dentry);
void eventfs_set_ef_status_free(struct tracefs_inode *ti, struct dentry *dentry); void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry);
#endif /* _TRACEFS_INTERNAL_H */ #endif /* _TRACEFS_INTERNAL_H */
...@@ -649,7 +649,7 @@ struct trace_event_file { ...@@ -649,7 +649,7 @@ struct trace_event_file {
struct list_head list; struct list_head list;
struct trace_event_call *event_call; struct trace_event_call *event_call;
struct event_filter __rcu *filter; struct event_filter __rcu *filter;
struct eventfs_file *ef; struct eventfs_inode *ei;
struct trace_array *tr; struct trace_array *tr;
struct trace_subsystem_dir *system; struct trace_subsystem_dir *system;
struct list_head triggers; struct list_head triggers;
......
...@@ -23,26 +23,25 @@ struct file_operations; ...@@ -23,26 +23,25 @@ struct file_operations;
struct eventfs_file; struct eventfs_file;
struct dentry *eventfs_create_events_dir(const char *name, typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
struct dentry *parent); const struct file_operations **fops);
struct eventfs_file *eventfs_add_subsystem_dir(const char *name, struct eventfs_entry {
struct dentry *parent); const char *name;
eventfs_callback callback;
};
struct eventfs_file *eventfs_add_dir(const char *name, struct eventfs_inode;
struct eventfs_file *ef_parent);
int eventfs_add_file(const char *name, umode_t mode, struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry *parent,
struct eventfs_file *ef_parent, void *data, const struct eventfs_entry *entries,
const struct file_operations *fops); int size, void *data);
int eventfs_add_events_file(const char *name, umode_t mode, struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode *parent,
struct dentry *parent, void *data, const struct eventfs_entry *entries,
const struct file_operations *fops); int size, void *data);
void eventfs_remove(struct eventfs_file *ef); void eventfs_remove_dir(struct eventfs_inode *ei);
void eventfs_remove_events_dir(struct dentry *dentry);
struct dentry *tracefs_create_file(const char *name, umode_t mode, struct dentry *tracefs_create_file(const char *name, umode_t mode,
struct dentry *parent, void *data, struct dentry *parent, void *data,
......
...@@ -9764,7 +9764,6 @@ static __init void create_trace_instances(struct dentry *d_tracer) ...@@ -9764,7 +9764,6 @@ static __init void create_trace_instances(struct dentry *d_tracer)
static void static void
init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
{ {
struct trace_event_file *file;
int cpu; int cpu;
trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer, trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer,
...@@ -9797,11 +9796,7 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) ...@@ -9797,11 +9796,7 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
trace_create_file("trace_marker", 0220, d_tracer, trace_create_file("trace_marker", 0220, d_tracer,
tr, &tracing_mark_fops); tr, &tracing_mark_fops);
file = __find_event_file(tr, "ftrace", "print"); tr->trace_marker_file = __find_event_file(tr, "ftrace", "print");
if (file && file->ef)
eventfs_add_file("trigger", TRACE_MODE_WRITE, file->ef,
file, &event_trigger_fops);
tr->trace_marker_file = file;
trace_create_file("trace_marker_raw", 0220, d_tracer, trace_create_file("trace_marker_raw", 0220, d_tracer,
tr, &tracing_mark_raw_fops); tr, &tracing_mark_raw_fops);
......
...@@ -381,7 +381,7 @@ struct trace_array { ...@@ -381,7 +381,7 @@ struct trace_array {
struct dentry *dir; struct dentry *dir;
struct dentry *options; struct dentry *options;
struct dentry *percpu_dir; struct dentry *percpu_dir;
struct dentry *event_dir; struct eventfs_inode *event_dir;
struct trace_options *topts; struct trace_options *topts;
struct list_head systems; struct list_head systems;
struct list_head events; struct list_head events;
...@@ -1349,7 +1349,7 @@ struct trace_subsystem_dir { ...@@ -1349,7 +1349,7 @@ struct trace_subsystem_dir {
struct list_head list; struct list_head list;
struct event_subsystem *subsystem; struct event_subsystem *subsystem;
struct trace_array *tr; struct trace_array *tr;
struct eventfs_file *ef; struct eventfs_inode *ei;
int ref_count; int ref_count;
int nr_events; int nr_events;
}; };
......
...@@ -984,7 +984,7 @@ static void remove_subsystem(struct trace_subsystem_dir *dir) ...@@ -984,7 +984,7 @@ static void remove_subsystem(struct trace_subsystem_dir *dir)
return; return;
if (!--dir->nr_events) { if (!--dir->nr_events) {
eventfs_remove(dir->ef); eventfs_remove_dir(dir->ei);
list_del(&dir->list); list_del(&dir->list);
__put_system_dir(dir); __put_system_dir(dir);
} }
...@@ -992,7 +992,7 @@ static void remove_subsystem(struct trace_subsystem_dir *dir) ...@@ -992,7 +992,7 @@ static void remove_subsystem(struct trace_subsystem_dir *dir)
static void remove_event_file_dir(struct trace_event_file *file) static void remove_event_file_dir(struct trace_event_file *file)
{ {
eventfs_remove(file->ef); eventfs_remove_dir(file->ei);
list_del(&file->list); list_del(&file->list);
remove_subsystem(file->system); remove_subsystem(file->system);
free_event_filter(file->filter); free_event_filter(file->filter);
...@@ -2282,14 +2282,40 @@ create_new_subsystem(const char *name) ...@@ -2282,14 +2282,40 @@ create_new_subsystem(const char *name)
return NULL; return NULL;
} }
static struct eventfs_file * int system_callback(const char *name, umode_t *mode, void **data,
const struct file_operations **fops)
{
if (strcmp(name, "filter") == 0)
*fops = &ftrace_subsystem_filter_fops;
else if (strcmp(name, "enable") == 0)
*fops = &ftrace_system_enable_fops;
else
return 0;
*mode = TRACE_MODE_WRITE;
return 1;
}
static struct eventfs_inode *
event_subsystem_dir(struct trace_array *tr, const char *name, event_subsystem_dir(struct trace_array *tr, const char *name,
struct trace_event_file *file, struct dentry *parent) struct trace_event_file *file, struct eventfs_inode *parent)
{ {
struct event_subsystem *system, *iter; struct event_subsystem *system, *iter;
struct trace_subsystem_dir *dir; struct trace_subsystem_dir *dir;
struct eventfs_file *ef; struct eventfs_inode *ei;
int res; int nr_entries;
static struct eventfs_entry system_entries[] = {
{
.name = "filter",
.callback = system_callback,
},
{
.name = "enable",
.callback = system_callback,
}
};
/* First see if we did not already create this dir */ /* First see if we did not already create this dir */
list_for_each_entry(dir, &tr->systems, list) { list_for_each_entry(dir, &tr->systems, list) {
...@@ -2297,7 +2323,7 @@ event_subsystem_dir(struct trace_array *tr, const char *name, ...@@ -2297,7 +2323,7 @@ event_subsystem_dir(struct trace_array *tr, const char *name,
if (strcmp(system->name, name) == 0) { if (strcmp(system->name, name) == 0) {
dir->nr_events++; dir->nr_events++;
file->system = dir; file->system = dir;
return dir->ef; return dir->ei;
} }
} }
...@@ -2321,39 +2347,29 @@ event_subsystem_dir(struct trace_array *tr, const char *name, ...@@ -2321,39 +2347,29 @@ event_subsystem_dir(struct trace_array *tr, const char *name,
} else } else
__get_system(system); __get_system(system);
ef = eventfs_add_subsystem_dir(name, parent); /* ftrace only has directories no files */
if (IS_ERR(ef)) { if (strcmp(name, "ftrace") == 0)
nr_entries = 0;
else
nr_entries = ARRAY_SIZE(system_entries);
ei = eventfs_create_dir(name, parent, system_entries, nr_entries, dir);
if (!ei) {
pr_warn("Failed to create system directory %s\n", name); pr_warn("Failed to create system directory %s\n", name);
__put_system(system); __put_system(system);
goto out_free; goto out_free;
} }
dir->ef = ef; dir->ei = ei;
dir->tr = tr; dir->tr = tr;
dir->ref_count = 1; dir->ref_count = 1;
dir->nr_events = 1; dir->nr_events = 1;
dir->subsystem = system; dir->subsystem = system;
file->system = dir; file->system = dir;
/* the ftrace system is special, do not create enable or filter files */
if (strcmp(name, "ftrace") != 0) {
res = eventfs_add_file("filter", TRACE_MODE_WRITE,
dir->ef, dir,
&ftrace_subsystem_filter_fops);
if (res) {
kfree(system->filter);
system->filter = NULL;
pr_warn("Could not create tracefs '%s/filter' entry\n", name);
}
eventfs_add_file("enable", TRACE_MODE_WRITE, dir->ef, dir,
&ftrace_system_enable_fops);
}
list_add(&dir->list, &tr->systems); list_add(&dir->list, &tr->systems);
return dir->ef; return dir->ei;
out_free: out_free:
kfree(dir); kfree(dir);
...@@ -2402,15 +2418,134 @@ event_define_fields(struct trace_event_call *call) ...@@ -2402,15 +2418,134 @@ event_define_fields(struct trace_event_call *call)
return ret; return ret;
} }
static int event_callback(const char *name, umode_t *mode, void **data,
const struct file_operations **fops)
{
struct trace_event_file *file = *data;
struct trace_event_call *call = file->event_call;
if (strcmp(name, "format") == 0) {
*mode = TRACE_MODE_READ;
*fops = &ftrace_event_format_fops;
*data = call;
return 1;
}
/*
* Only event directories that can be enabled should have
* triggers or filters, with the exception of the "print"
* event that can have a "trigger" file.
*/
if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)) {
if (call->class->reg && strcmp(name, "enable") == 0) {
*mode = TRACE_MODE_WRITE;
*fops = &ftrace_enable_fops;
return 1;
}
if (strcmp(name, "filter") == 0) {
*mode = TRACE_MODE_WRITE;
*fops = &ftrace_event_filter_fops;
return 1;
}
}
if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE) ||
strcmp(trace_event_name(call), "print") == 0) {
if (strcmp(name, "trigger") == 0) {
*mode = TRACE_MODE_WRITE;
*fops = &event_trigger_fops;
return 1;
}
}
#ifdef CONFIG_PERF_EVENTS
if (call->event.type && call->class->reg &&
strcmp(name, "id") == 0) {
*mode = TRACE_MODE_READ;
*data = (void *)(long)call->event.type;
*fops = &ftrace_event_id_fops;
return 1;
}
#endif
#ifdef CONFIG_HIST_TRIGGERS
if (strcmp(name, "hist") == 0) {
*mode = TRACE_MODE_READ;
*fops = &event_hist_fops;
return 1;
}
#endif
#ifdef CONFIG_HIST_TRIGGERS_DEBUG
if (strcmp(name, "hist_debug") == 0) {
*mode = TRACE_MODE_READ;
*fops = &event_hist_debug_fops;
return 1;
}
#endif
#ifdef CONFIG_TRACE_EVENT_INJECT
if (call->event.type && call->class->reg &&
strcmp(name, "inject") == 0) {
*mode = 0200;
*fops = &event_inject_fops;
return 1;
}
#endif
return 0;
}
static int static int
event_create_dir(struct dentry *parent, struct trace_event_file *file) event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
{ {
struct trace_event_call *call = file->event_call; struct trace_event_call *call = file->event_call;
struct eventfs_file *ef_subsystem = NULL;
struct trace_array *tr = file->tr; struct trace_array *tr = file->tr;
struct eventfs_file *ef; struct eventfs_inode *e_events;
struct eventfs_inode *ei;
const char *name; const char *name;
int nr_entries;
int ret; int ret;
static struct eventfs_entry event_entries[] = {
{
.name = "enable",
.callback = event_callback,
},
{
.name = "filter",
.callback = event_callback,
},
{
.name = "trigger",
.callback = event_callback,
},
{
.name = "format",
.callback = event_callback,
},
#ifdef CONFIG_PERF_EVENTS
{
.name = "id",
.callback = event_callback,
},
#endif
#ifdef CONFIG_HIST_TRIGGERS
{
.name = "hist",
.callback = event_callback,
},
#endif
#ifdef CONFIG_HIST_TRIGGERS_DEBUG
{
.name = "hist_debug",
.callback = event_callback,
},
#endif
#ifdef CONFIG_TRACE_EVENT_INJECT
{
.name = "inject",
.callback = event_callback,
},
#endif
};
/* /*
* If the trace point header did not define TRACE_SYSTEM * If the trace point header did not define TRACE_SYSTEM
...@@ -2420,29 +2555,20 @@ event_create_dir(struct dentry *parent, struct trace_event_file *file) ...@@ -2420,29 +2555,20 @@ event_create_dir(struct dentry *parent, struct trace_event_file *file)
if (WARN_ON_ONCE(strcmp(call->class->system, TRACE_SYSTEM) == 0)) if (WARN_ON_ONCE(strcmp(call->class->system, TRACE_SYSTEM) == 0))
return -ENODEV; return -ENODEV;
ef_subsystem = event_subsystem_dir(tr, call->class->system, file, parent); e_events = event_subsystem_dir(tr, call->class->system, file, parent);
if (!ef_subsystem) if (!e_events)
return -ENOMEM; return -ENOMEM;
nr_entries = ARRAY_SIZE(event_entries);
name = trace_event_name(call); name = trace_event_name(call);
ef = eventfs_add_dir(name, ef_subsystem); ei = eventfs_create_dir(name, e_events, event_entries, nr_entries, file);
if (IS_ERR(ef)) { if (IS_ERR(ei)) {
pr_warn("Could not create tracefs '%s' directory\n", name); pr_warn("Could not create tracefs '%s' directory\n", name);
return -1; return -1;
} }
file->ef = ef; file->ei = ei;
if (call->class->reg && !(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE))
eventfs_add_file("enable", TRACE_MODE_WRITE, file->ef, file,
&ftrace_enable_fops);
#ifdef CONFIG_PERF_EVENTS
if (call->event.type && call->class->reg)
eventfs_add_file("id", TRACE_MODE_READ, file->ef,
(void *)(long)call->event.type,
&ftrace_event_id_fops);
#endif
ret = event_define_fields(call); ret = event_define_fields(call);
if (ret < 0) { if (ret < 0) {
...@@ -2450,35 +2576,6 @@ event_create_dir(struct dentry *parent, struct trace_event_file *file) ...@@ -2450,35 +2576,6 @@ event_create_dir(struct dentry *parent, struct trace_event_file *file)
return ret; return ret;
} }
/*
* Only event directories that can be enabled should have
* triggers or filters.
*/
if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)) {
eventfs_add_file("filter", TRACE_MODE_WRITE, file->ef,
file, &ftrace_event_filter_fops);
eventfs_add_file("trigger", TRACE_MODE_WRITE, file->ef,
file, &event_trigger_fops);
}
#ifdef CONFIG_HIST_TRIGGERS
eventfs_add_file("hist", TRACE_MODE_READ, file->ef, file,
&event_hist_fops);
#endif
#ifdef CONFIG_HIST_TRIGGERS_DEBUG
eventfs_add_file("hist_debug", TRACE_MODE_READ, file->ef, file,
&event_hist_debug_fops);
#endif
eventfs_add_file("format", TRACE_MODE_READ, file->ef, call,
&ftrace_event_format_fops);
#ifdef CONFIG_TRACE_EVENT_INJECT
if (call->event.type && call->class->reg)
eventfs_add_file("inject", 0200, file->ef, file,
&event_inject_fops);
#endif
return 0; return 0;
} }
...@@ -3623,30 +3720,65 @@ static __init int setup_trace_event(char *str) ...@@ -3623,30 +3720,65 @@ static __init int setup_trace_event(char *str)
} }
__setup("trace_event=", setup_trace_event); __setup("trace_event=", setup_trace_event);
static int events_callback(const char *name, umode_t *mode, void **data,
const struct file_operations **fops)
{
if (strcmp(name, "enable") == 0) {
*mode = TRACE_MODE_WRITE;
*fops = &ftrace_tr_enable_fops;
return 1;
}
if (strcmp(name, "header_page") == 0)
*data = ring_buffer_print_page_header;
else if (strcmp(name, "header_event") == 0)
*data = ring_buffer_print_entry_header;
else
return 0;
*mode = TRACE_MODE_READ;
*fops = &ftrace_show_header_fops;
return 1;
}
/* Expects to have event_mutex held when called */ /* Expects to have event_mutex held when called */
static int static int
create_event_toplevel_files(struct dentry *parent, struct trace_array *tr) create_event_toplevel_files(struct dentry *parent, struct trace_array *tr)
{ {
struct dentry *d_events; struct eventfs_inode *e_events;
struct dentry *entry; struct dentry *entry;
int error = 0; int nr_entries;
static struct eventfs_entry events_entries[] = {
{
.name = "enable",
.callback = events_callback,
},
{
.name = "header_page",
.callback = events_callback,
},
{
.name = "header_event",
.callback = events_callback,
},
};
entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent, entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent,
tr, &ftrace_set_event_fops); tr, &ftrace_set_event_fops);
if (!entry) if (!entry)
return -ENOMEM; return -ENOMEM;
d_events = eventfs_create_events_dir("events", parent); nr_entries = ARRAY_SIZE(events_entries);
if (IS_ERR(d_events)) {
e_events = eventfs_create_events_dir("events", parent, events_entries,
nr_entries, tr);
if (IS_ERR(e_events)) {
pr_warn("Could not create tracefs 'events' directory\n"); pr_warn("Could not create tracefs 'events' directory\n");
return -ENOMEM; return -ENOMEM;
} }
error = eventfs_add_events_file("enable", TRACE_MODE_WRITE, d_events,
tr, &ftrace_tr_enable_fops);
if (error)
return -ENOMEM;
/* There are not as crucial, just warn if they are not created */ /* There are not as crucial, just warn if they are not created */
trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent,
...@@ -3656,16 +3788,7 @@ create_event_toplevel_files(struct dentry *parent, struct trace_array *tr) ...@@ -3656,16 +3788,7 @@ create_event_toplevel_files(struct dentry *parent, struct trace_array *tr)
TRACE_MODE_WRITE, parent, tr, TRACE_MODE_WRITE, parent, tr,
&ftrace_set_event_notrace_pid_fops); &ftrace_set_event_notrace_pid_fops);
/* ring buffer internal formats */ tr->event_dir = e_events;
eventfs_add_events_file("header_page", TRACE_MODE_READ, d_events,
ring_buffer_print_page_header,
&ftrace_show_header_fops);
eventfs_add_events_file("header_event", TRACE_MODE_READ, d_events,
ring_buffer_print_entry_header,
&ftrace_show_header_fops);
tr->event_dir = d_events;
return 0; return 0;
} }
...@@ -3749,7 +3872,7 @@ int event_trace_del_tracer(struct trace_array *tr) ...@@ -3749,7 +3872,7 @@ int event_trace_del_tracer(struct trace_array *tr)
down_write(&trace_event_sem); down_write(&trace_event_sem);
__trace_remove_event_dirs(tr); __trace_remove_event_dirs(tr);
eventfs_remove_events_dir(tr->event_dir); eventfs_remove_dir(tr->event_dir);
up_write(&trace_event_sem); up_write(&trace_event_sem);
tr->event_dir = NULL; tr->event_dir = NULL;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment