Commit 29552b14 authored by Linus Torvalds's avatar Linus Torvalds
parents 6c59f9d9 51e7a598
......@@ -12,10 +12,14 @@ cifs.txt
- description of the CIFS filesystem
coda.txt
- description of the CODA filesystem.
configfs/
- directory containing configfs documentation and example code.
cramfs.txt
- info on the cram filesystem for small storage (ROMs etc)
devfs/
- directory containing devfs documentation.
dlmfs.txt
- info on the userspace interface to the OCFS2 DLM.
ext2.txt
- info, mount options and specifications for the Ext2 filesystem.
hpfs.txt
......@@ -30,6 +34,8 @@ ntfs.txt
- info and mount options for the NTFS filesystem (Windows NT).
proc.txt
- info on Linux's /proc filesystem.
ocfs2.txt
- info and mount options for the OCFS2 clustered filesystem.
romfs.txt
- Description of the ROMFS filesystem.
smbfs.txt
......
This diff is collapsed.
This diff is collapsed.
dlmfs
==================
A minimal DLM userspace interface implemented via a virtual file
system.
dlmfs is built with OCFS2 as it requires most of its infrastructure.
Project web page: http://oss.oracle.com/projects/ocfs2
Tools web page: http://oss.oracle.com/projects/ocfs2-tools
OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/
All code copyright 2005 Oracle except when otherwise noted.
CREDITS
=======
Some code taken from ramfs which is Copyright (C) 2000 Linus Torvalds
and Transmeta Corp.
Mark Fasheh <mark.fasheh@oracle.com>
Caveats
=======
- Right now it only works with the OCFS2 DLM, though support for other
DLM implementations should not be a major issue.
Mount options
=============
None
Usage
=====
If you're just interested in OCFS2, then please see ocfs2.txt. The
rest of this document will be geared towards those who want to use
dlmfs for easy to setup and easy to use clustered locking in
userspace.
Setup
=====
dlmfs requires that the OCFS2 cluster infrastructure be in
place. Please download ocfs2-tools from the above url and configure a
cluster.
You'll want to start heartbeating on a volume which all the nodes in
your lockspace can access. The easiest way to do this is via
ocfs2_hb_ctl (distributed with ocfs2-tools). Right now it requires
that an OCFS2 file system be in place so that it can automatically
find it's heartbeat area, though it will eventually support heartbeat
against raw disks.
Please see the ocfs2_hb_ctl and mkfs.ocfs2 manual pages distributed
with ocfs2-tools.
Once you're heartbeating, DLM lock 'domains' can be easily created /
destroyed and locks within them accessed.
Locking
=======
Users may access dlmfs via standard file system calls, or they can use
'libo2dlm' (distributed with ocfs2-tools) which abstracts the file
system calls and presents a more traditional locking api.
dlmfs handles lock caching automatically for the user, so a lock
request for an already acquired lock will not generate another DLM
call. Userspace programs are assumed to handle their own local
locking.
Two levels of locks are supported - Shared Read, and Exlcusive.
Also supported is a Trylock operation.
For information on the libo2dlm interface, please see o2dlm.h,
distributed with ocfs2-tools.
Lock value blocks can be read and written to a resource via read(2)
and write(2) against the fd obtained via your open(2) call. The
maximum currently supported LVB length is 64 bytes (though that is an
OCFS2 DLM limitation). Through this mechanism, users of dlmfs can share
small amounts of data amongst their nodes.
mkdir(2) signals dlmfs to join a domain (which will have the same name
as the resulting directory)
rmdir(2) signals dlmfs to leave the domain
Locks for a given domain are represented by regular inodes inside the
domain directory. Locking against them is done via the open(2) system
call.
The open(2) call will not return until your lock has been granted or
an error has occurred, unless it has been instructed to do a trylock
operation. If the lock succeeds, you'll get an fd.
open(2) with O_CREAT to ensure the resource inode is created - dlmfs does
not automatically create inodes for existing lock resources.
Open Flag Lock Request Type
--------- -----------------
O_RDONLY Shared Read
O_RDWR Exclusive
Open Flag Resulting Locking Behavior
--------- --------------------------
O_NONBLOCK Trylock operation
You must provide exactly one of O_RDONLY or O_RDWR.
If O_NONBLOCK is also provided and the trylock operation was valid but
could not lock the resource then open(2) will return ETXTBUSY.
close(2) drops the lock associated with your fd.
Modes passed to mkdir(2) or open(2) are adhered to locally. Chown is
supported locally as well. This means you can use them to restrict
access to the resources via dlmfs on your local node only.
The resource LVB may be read from the fd in either Shared Read or
Exclusive modes via the read(2) system call. It can be written via
write(2) only when open in Exclusive mode.
Once written, an LVB will be visible to other nodes who obtain Read
Only or higher level locks on the resource.
See Also
========
http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf
For more information on the VMS distributed locking API.
OCFS2 filesystem
==================
OCFS2 is a general purpose extent based shared disk cluster file
system with many similarities to ext3. It supports 64 bit inode
numbers, and has automatically extending metadata groups which may
also make it attractive for non-clustered use.
You'll want to install the ocfs2-tools package in order to at least
get "mount.ocfs2" and "ocfs2_hb_ctl".
Project web page: http://oss.oracle.com/projects/ocfs2
Tools web page: http://oss.oracle.com/projects/ocfs2-tools
OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/
All code copyright 2005 Oracle except when otherwise noted.
CREDITS:
Lots of code taken from ext3 and other projects.
Authors in alphabetical order:
Joel Becker <joel.becker@oracle.com>
Zach Brown <zach.brown@oracle.com>
Mark Fasheh <mark.fasheh@oracle.com>
Kurt Hackel <kurt.hackel@oracle.com>
Sunil Mushran <sunil.mushran@oracle.com>
Manish Singh <manish.singh@oracle.com>
Caveats
=======
Features which OCFS2 does not support yet:
- sparse files
- extended attributes
- shared writeable mmap
- loopback is supported, but data written will not
be cluster coherent.
- quotas
- cluster aware flock
- Directory change notification (F_NOTIFY)
- Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease)
- POSIX ACLs
- readpages / writepages (not user visible)
Mount options
=============
OCFS2 supports the following mount options:
(*) == default
barrier=1 This enables/disables barriers. barrier=0 disables it,
barrier=1 enables it.
errors=remount-ro(*) Remount the filesystem read-only on an error.
errors=panic Panic and halt the machine if an error occurs.
intr (*) Allow signals to interrupt cluster operations.
nointr Do not allow signals to interrupt cluster
operations.
......@@ -554,6 +554,11 @@ W: http://us1.samba.org/samba/Linux_CIFS_client.html
T: git kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6.git
S: Supported
CONFIGFS
P: Joel Becker
M: Joel Becker <joel.becker@oracle.com>
S: Supported
CIRRUS LOGIC GENERIC FBDEV DRIVER
P: Jeff Garzik
M: jgarzik@pobox.com
......@@ -1898,6 +1903,15 @@ M: ajoshi@shell.unixbox.com
L: linux-nvidia@lists.surfsouth.com
S: Maintained
ORACLE CLUSTER FILESYSTEM 2 (OCFS2)
P: Mark Fasheh
M: mark.fasheh@oracle.com
P: Kurt Hackel
M: kurt.hackel@oracle.com
L: ocfs2-devel@oss.oracle.com
W: http://oss.oracle.com/projects/ocfs2/
S: Supported
OLYMPIC NETWORK DRIVER
P: Peter De Shrijver
M: p2@ace.ulyssis.student.kuleuven.ac.be
......
......@@ -213,7 +213,7 @@ static int do_lo_send_aops(struct loop_device *lo, struct bio_vec *bvec,
struct address_space_operations *aops = mapping->a_ops;
pgoff_t index;
unsigned offset, bv_offs;
int len, ret = 0;
int len, ret;
down(&mapping->host->i_sem);
index = pos >> PAGE_CACHE_SHIFT;
......@@ -232,9 +232,15 @@ static int do_lo_send_aops(struct loop_device *lo, struct bio_vec *bvec,
page = grab_cache_page(mapping, index);
if (unlikely(!page))
goto fail;
if (unlikely(aops->prepare_write(file, page, offset,
offset + size)))
ret = aops->prepare_write(file, page, offset,
offset + size);
if (unlikely(ret)) {
if (ret == AOP_TRUNCATED_PAGE) {
page_cache_release(page);
continue;
}
goto unlock;
}
transfer_result = lo_do_transfer(lo, WRITE, page, offset,
bvec->bv_page, bv_offs, size, IV);
if (unlikely(transfer_result)) {
......@@ -251,9 +257,15 @@ static int do_lo_send_aops(struct loop_device *lo, struct bio_vec *bvec,
kunmap_atomic(kaddr, KM_USER0);
}
flush_dcache_page(page);
if (unlikely(aops->commit_write(file, page, offset,
offset + size)))
ret = aops->commit_write(file, page, offset,
offset + size);
if (unlikely(ret)) {
if (ret == AOP_TRUNCATED_PAGE) {
page_cache_release(page);
continue;
}
goto unlock;
}
if (unlikely(transfer_result))
goto unlock;
bv_offs += size;
......@@ -264,6 +276,7 @@ static int do_lo_send_aops(struct loop_device *lo, struct bio_vec *bvec,
unlock_page(page);
page_cache_release(page);
}
ret = 0;
out:
up(&mapping->host->i_sem);
return ret;
......
......@@ -154,7 +154,7 @@ static int ramdisk_commit_write(struct file *file, struct page *page,
/*
* ->writepage to the the blockdev's mapping has to redirty the page so that the
* VM doesn't go and steal it. We return WRITEPAGE_ACTIVATE so that the VM
* VM doesn't go and steal it. We return AOP_WRITEPAGE_ACTIVATE so that the VM
* won't try to (pointlessly) write the page again for a while.
*
* Really, these pages should not be on the LRU at all.
......@@ -165,7 +165,7 @@ static int ramdisk_writepage(struct page *page, struct writeback_control *wbc)
make_page_uptodate(page);
SetPageDirty(page);
if (wbc->for_reclaim)
return WRITEPAGE_ACTIVATE;
return AOP_WRITEPAGE_ACTIVATE;
unlock_page(page);
return 0;
}
......
......@@ -70,6 +70,7 @@ config FS_XIP
config EXT3_FS
tristate "Ext3 journalling file system support"
select JBD
help
This is the journaling version of the Second extended file system
(often called ext3), the de facto standard Linux file system
......@@ -138,23 +139,20 @@ config EXT3_FS_SECURITY
extended attributes for file security labels, say N.
config JBD
# CONFIG_JBD could be its own option (even modular), but until there are
# other users than ext3, we will simply make it be the same as CONFIG_EXT3_FS
# dep_tristate ' Journal Block Device support (JBD for ext3)' CONFIG_JBD $CONFIG_EXT3_FS
tristate
default EXT3_FS
help
This is a generic journaling layer for block devices. It is
currently used by the ext3 file system, but it could also be used to
add journal support to other file systems or block devices such as
RAID or LVM.
currently used by the ext3 and OCFS2 file systems, but it could
also be used to add journal support to other file systems or block
devices such as RAID or LVM.
If you are using the ext3 file system, you need to say Y here. If
you are not using ext3 then you will probably want to say N.
If you are using the ext3 or OCFS2 file systems, you need to
say Y here. If you are not using ext3 OCFS2 then you will probably
want to say N.
To compile this device as a module, choose M here: the module will be
called jbd. If you are compiling ext3 into the kernel, you cannot
compile this code as a module.
called jbd. If you are compiling ext3 or OCFS2 into the kernel,
you cannot compile this code as a module.
config JBD_DEBUG
bool "JBD (ext3) debugging support"
......@@ -326,6 +324,38 @@ config FS_POSIX_ACL
source "fs/xfs/Kconfig"
config OCFS2_FS
tristate "OCFS2 file system support (EXPERIMENTAL)"
depends on NET && EXPERIMENTAL
select CONFIGFS_FS
select JBD
select CRC32
select INET
help
OCFS2 is a general purpose extent based shared disk cluster file
system with many similarities to ext3. It supports 64 bit inode
numbers, and has automatically extending metadata groups which may
also make it attractive for non-clustered use.
You'll want to install the ocfs2-tools package in order to at least
get "mount.ocfs2".
Project web page: http://oss.oracle.com/projects/ocfs2
Tools web page: http://oss.oracle.com/projects/ocfs2-tools
OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/
Note: Features which OCFS2 does not support yet:
- extended attributes
- shared writeable mmap
- loopback is supported, but data written will not
be cluster coherent.
- quotas
- cluster aware flock
- Directory change notification (F_NOTIFY)
- Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease)
- POSIX ACLs
- readpages / writepages (not user visible)
config MINIX_FS
tristate "Minix fs support"
help
......@@ -841,6 +871,20 @@ config RELAYFS_FS
If unsure, say N.
config CONFIGFS_FS
tristate "Userspace-driven configuration filesystem (EXPERIMENTAL)"
depends on EXPERIMENTAL
help
configfs is a ram-based filesystem that provides the converse
of sysfs's functionality. Where sysfs is a filesystem-based
view of kernel objects, configfs is a filesystem-based manager
of kernel objects, or config_items.
Both sysfs and configfs can and should exist together on the
same system. One is not a replacement for the other.
If unsure, say N.
endmenu
menu "Miscellaneous filesystems"
......
......@@ -101,3 +101,5 @@ obj-$(CONFIG_BEFS_FS) += befs/
obj-$(CONFIG_HOSTFS) += hostfs/
obj-$(CONFIG_HPPFS) += hppfs/
obj-$(CONFIG_DEBUG_FS) += debugfs/
obj-$(CONFIG_CONFIGFS_FS) += configfs/
obj-$(CONFIG_OCFS2_FS) += ocfs2/
#
# Makefile for the configfs virtual filesystem
#
obj-$(CONFIG_CONFIGFS_FS) += configfs.o
configfs-objs := inode.o file.o dir.o symlink.o mount.o item.o
/* -*- mode: c; c-basic-offset:8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* configfs_internal.h - Internal stuff for configfs
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
* Based on sysfs:
* sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
*
* configfs Copyright (C) 2005 Oracle. All rights reserved.
*/
#include <linux/slab.h>
#include <linux/list.h>
struct configfs_dirent {
atomic_t s_count;
struct list_head s_sibling;
struct list_head s_children;
struct list_head s_links;
void * s_element;
int s_type;
umode_t s_mode;
struct dentry * s_dentry;
};
#define CONFIGFS_ROOT 0x0001
#define CONFIGFS_DIR 0x0002
#define CONFIGFS_ITEM_ATTR 0x0004
#define CONFIGFS_ITEM_LINK 0x0020
#define CONFIGFS_USET_DIR 0x0040
#define CONFIGFS_USET_DEFAULT 0x0080
#define CONFIGFS_USET_DROPPING 0x0100
#define CONFIGFS_NOT_PINNED (CONFIGFS_ITEM_ATTR)
extern struct vfsmount * configfs_mount;
extern int configfs_is_root(struct config_item *item);
extern struct inode * configfs_new_inode(mode_t mode);
extern int configfs_create(struct dentry *, int mode, int (*init)(struct inode *));
extern int configfs_create_file(struct config_item *, const struct configfs_attribute *);
extern int configfs_make_dirent(struct configfs_dirent *,
struct dentry *, void *, umode_t, int);
extern int configfs_add_file(struct dentry *, const struct configfs_attribute *, int);
extern void configfs_hash_and_remove(struct dentry * dir, const char * name);
extern const unsigned char * configfs_get_name(struct configfs_dirent *sd);
extern void configfs_drop_dentry(struct configfs_dirent *sd, struct dentry *parent);
extern int configfs_pin_fs(void);
extern void configfs_release_fs(void);
extern struct rw_semaphore configfs_rename_sem;
extern struct super_block * configfs_sb;
extern struct file_operations configfs_dir_operations;
extern struct file_operations configfs_file_operations;
extern struct file_operations bin_fops;
extern struct inode_operations configfs_dir_inode_operations;
extern struct inode_operations configfs_symlink_inode_operations;
extern int configfs_symlink(struct inode *dir, struct dentry *dentry,
const char *symname);
extern int configfs_unlink(struct inode *dir, struct dentry *dentry);
struct configfs_symlink {
struct list_head sl_list;
struct config_item *sl_target;
};
extern int configfs_create_link(struct configfs_symlink *sl,
struct dentry *parent,
struct dentry *dentry);
static inline struct config_item * to_item(struct dentry * dentry)
{
struct configfs_dirent * sd = dentry->d_fsdata;
return ((struct config_item *) sd->s_element);
}
static inline struct configfs_attribute * to_attr(struct dentry * dentry)
{
struct configfs_dirent * sd = dentry->d_fsdata;
return ((struct configfs_attribute *) sd->s_element);
}
static inline struct config_item *configfs_get_config_item(struct dentry *dentry)
{
struct config_item * item = NULL;
spin_lock(&dcache_lock);
if (!d_unhashed(dentry)) {
struct configfs_dirent * sd = dentry->d_fsdata;
if (sd->s_type & CONFIGFS_ITEM_LINK) {
struct configfs_symlink * sl = sd->s_element;
item = config_item_get(sl->sl_target);
} else
item = config_item_get(sd->s_element);
}
spin_unlock(&dcache_lock);
return item;
}
static inline void release_configfs_dirent(struct configfs_dirent * sd)
{
if (!(sd->s_type & CONFIGFS_ROOT))
kfree(sd);
}
static inline struct configfs_dirent * configfs_get(struct configfs_dirent * sd)
{
if (sd) {
WARN_ON(!atomic_read(&sd->s_count));
atomic_inc(&sd->s_count);
}
return sd;
}
static inline void configfs_put(struct configfs_dirent * sd)
{
WARN_ON(!atomic_read(&sd->s_count));
if (atomic_dec_and_test(&sd->s_count))
release_configfs_dirent(sd);
}
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* file.c - operations for regular (text) files.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
* Based on sysfs:
* sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
*
* configfs Copyright (C) 2005 Oracle. All rights reserved.
*/
#include <linux/fs.h>
#include <linux/module.h>
#include <linux/dnotify.h>
#include <linux/slab.h>
#include <asm/uaccess.h>
#include <asm/semaphore.h>
#include <linux/configfs.h>
#include "configfs_internal.h"
struct configfs_buffer {
size_t count;
loff_t pos;
char * page;
struct configfs_item_operations * ops;
struct semaphore sem;
int needs_read_fill;
};
/**
* fill_read_buffer - allocate and fill buffer from item.
* @dentry: dentry pointer.
* @buffer: data buffer for file.
*
* Allocate @buffer->page, if it hasn't been already, then call the
* config_item's show() method to fill the buffer with this attribute's
* data.
* This is called only once, on the file's first read.
*/
static int fill_read_buffer(struct dentry * dentry, struct configfs_buffer * buffer)
{
struct configfs_attribute * attr = to_attr(dentry);
struct config_item * item = to_item(dentry->d_parent);
struct configfs_item_operations * ops = buffer->ops;
int ret = 0;
ssize_t count;
if (!buffer->page)
buffer->page = (char *) get_zeroed_page(GFP_KERNEL);
if (!buffer->page)
return -ENOMEM;
count = ops->show_attribute(item,attr,buffer->page);
buffer->needs_read_fill = 0;
BUG_ON(count > (ssize_t)PAGE_SIZE);
if (count >= 0)
buffer->count = count;
else
ret = count;
return ret;
}
/**
* flush_read_buffer - push buffer to userspace.
* @buffer: data buffer for file.
* @userbuf: user-passed buffer.
* @count: number of bytes requested.
* @ppos: file position.
*
* Copy the buffer we filled in fill_read_buffer() to userspace.
* This is done at the reader's leisure, copying and advancing
* the amount they specify each time.
* This may be called continuously until the buffer is empty.
*/
static int flush_read_buffer(struct configfs_buffer * buffer, char __user * buf,
size_t count, loff_t * ppos)
{
int error;
if (*ppos > buffer->count)
return 0;
if (count > (buffer->count - *ppos))
count = buffer->count - *ppos;
error = copy_to_user(buf,buffer->page + *ppos,count);
if (!error)
*ppos += count;
return error ? -EFAULT : count;
}
/**
* configfs_read_file - read an attribute.
* @file: file pointer.
* @buf: buffer to fill.
* @count: number of bytes to read.
* @ppos: starting offset in file.
*
* Userspace wants to read an attribute file. The attribute descriptor
* is in the file's ->d_fsdata. The target item is in the directory's
* ->d_fsdata.
*
* We call fill_read_buffer() to allocate and fill the buffer from the
* item's show() method exactly once (if the read is happening from
* the beginning of the file). That should fill the entire buffer with
* all the data the item has to offer for that attribute.
* We then call flush_read_buffer() to copy the buffer to userspace
* in the increments specified.
*/
static ssize_t
configfs_read_file(struct file *file, char __user *buf, size_t count, loff_t *ppos)
{
struct configfs_buffer * buffer = file->private_data;
ssize_t retval = 0;
down(&buffer->sem);
if (buffer->needs_read_fill) {
if ((retval = fill_read_buffer(file->f_dentry,buffer)))
goto out;
}
pr_debug("%s: count = %d, ppos = %lld, buf = %s\n",
__FUNCTION__,count,*ppos,buffer->page);
retval = flush_read_buffer(buffer,buf,count,ppos);
out:
up(&buffer->sem);
return retval;
}
/**
* fill_write_buffer - copy buffer from userspace.
* @buffer: data buffer for file.
* @userbuf: data from user.
* @count: number of bytes in @userbuf.
*
* Allocate @buffer->page if it hasn't been already, then
* copy the user-supplied buffer into it.
*/
static int
fill_write_buffer(struct configfs_buffer * buffer, const char __user * buf, size_t count)
{
int error;
if (!buffer->page)
buffer->page = (char *)get_zeroed_page(GFP_KERNEL);
if (!buffer->page)
return -ENOMEM;
if (count > PAGE_SIZE)
count = PAGE_SIZE;
error = copy_from_user(buffer->page,buf,count);
buffer->needs_read_fill = 1;
return error ? -EFAULT : count;
}
/**
* flush_write_buffer - push buffer to config_item.
* @file: file pointer.
* @buffer: data buffer for file.
*
* Get the correct pointers for the config_item and the attribute we're
* dealing with, then call the store() method for the attribute,
* passing the buffer that we acquired in fill_write_buffer().
*/
static int
flush_write_buffer(struct dentry * dentry, struct configfs_buffer * buffer, size_t count)
{
struct configfs_attribute * attr = to_attr(dentry);
struct config_item * item = to_item(dentry->d_parent);
struct configfs_item_operations * ops = buffer->ops;
return ops->store_attribute(item,attr,buffer->page,count);
}
/**
* configfs_write_file - write an attribute.
* @file: file pointer
* @buf: data to write
* @count: number of bytes
* @ppos: starting offset
*
* Similar to configfs_read_file(), though working in the opposite direction.
* We allocate and fill the data from the user in fill_write_buffer(),
* then push it to the config_item in flush_write_buffer().
* There is no easy way for us to know if userspace is only doing a partial
* write, so we don't support them. We expect the entire buffer to come
* on the first write.
* Hint: if you're writing a value, first read the file, modify only the
* the value you're changing, then write entire buffer back.
*/
static ssize_t
configfs_write_file(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
{
struct configfs_buffer * buffer = file->private_data;
down(&buffer->sem);
count = fill_write_buffer(buffer,buf,count);
if (count > 0)
count = flush_write_buffer(file->f_dentry,buffer,count);
if (count > 0)
*ppos += count;
up(&buffer->sem);
return count;
}
static int check_perm(struct inode * inode, struct file * file)
{
struct config_item *item = configfs_get_config_item(file->f_dentry->d_parent);
struct configfs_attribute * attr = to_attr(file->f_dentry);
struct configfs_buffer * buffer;
struct configfs_item_operations * ops = NULL;
int error = 0;
if (!item || !attr)
goto Einval;
/* Grab the module reference for this attribute if we have one */
if (!try_module_get(attr->ca_owner)) {
error = -ENODEV;
goto Done;
}
if (item->ci_type)
ops = item->ci_type->ct_item_ops;
else
goto Eaccess;
/* File needs write support.
* The inode's perms must say it's ok,
* and we must have a store method.
*/
if (file->f_mode & FMODE_WRITE) {
if (!(inode->i_mode & S_IWUGO) || !ops->store_attribute)
goto Eaccess;
}
/* File needs read support.
* The inode's perms must say it's ok, and we there
* must be a show method for it.
*/
if (file->f_mode & FMODE_READ) {
if (!(inode->i_mode & S_IRUGO) || !ops->show_attribute)
goto Eaccess;
}
/* No error? Great, allocate a buffer for the file, and store it
* it in file->private_data for easy access.
*/
buffer = kmalloc(sizeof(struct configfs_buffer),GFP_KERNEL);
if (buffer) {
memset(buffer,0,sizeof(struct configfs_buffer));
init_MUTEX(&buffer->sem);
buffer->needs_read_fill = 1;
buffer->ops = ops;
file->private_data = buffer;
} else
error = -ENOMEM;
goto Done;
Einval:
error = -EINVAL;
goto Done;
Eaccess:
error = -EACCES;
module_put(attr->ca_owner);
Done:
if (error && item)
config_item_put(item);
return error;
}
static int configfs_open_file(struct inode * inode, struct file * filp)
{
return check_perm(inode,filp);
}
static int configfs_release(struct inode * inode, struct file * filp)
{
struct config_item * item = to_item(filp->f_dentry->d_parent);
struct configfs_attribute * attr = to_attr(filp->f_dentry);
struct module * owner = attr->ca_owner;
struct configfs_buffer * buffer = filp->private_data;
if (item)
config_item_put(item);
/* After this point, attr should not be accessed. */
module_put(owner);
if (buffer) {
if (buffer->page)
free_page((unsigned long)buffer->page);
kfree(buffer);
}
return 0;
}
struct file_operations configfs_file_operations = {
.read = configfs_read_file,
.write = configfs_write_file,
.llseek = generic_file_llseek,
.open = configfs_open_file,
.release = configfs_release,
};
int configfs_add_file(struct dentry * dir, const struct configfs_attribute * attr, int type)
{
struct configfs_dirent * parent_sd = dir->d_fsdata;
umode_t mode = (attr->ca_mode & S_IALLUGO) | S_IFREG;
int error = 0;
down(&dir->d_inode->i_sem);
error = configfs_make_dirent(parent_sd, NULL, (void *) attr, mode, type);
up(&dir->d_inode->i_sem);
return error;
}
/**
* configfs_create_file - create an attribute file for an item.
* @item: item we're creating for.
* @attr: atrribute descriptor.
*/
int configfs_create_file(struct config_item * item, const struct configfs_attribute * attr)
{
BUG_ON(!item || !item->ci_dentry || !attr);
return configfs_add_file(item->ci_dentry, attr,
CONFIGFS_ITEM_ATTR);
}
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* inode.c - basic inode and dentry operations.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
* Based on sysfs:
* sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
*
* configfs Copyright (C) 2005 Oracle. All rights reserved.
*
* Please see Documentation/filesystems/configfs.txt for more information.
*/
#undef DEBUG
#include <linux/pagemap.h>
#include <linux/namei.h>
#include <linux/backing-dev.h>
#include <linux/configfs.h>
#include "configfs_internal.h"
extern struct super_block * configfs_sb;
static struct address_space_operations configfs_aops = {
.readpage = simple_readpage,
.prepare_write = simple_prepare_write,
.commit_write = simple_commit_write
};
static struct backing_dev_info configfs_backing_dev_info = {
.ra_pages = 0, /* No readahead */
.capabilities = BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_NO_WRITEBACK,
};
struct inode * configfs_new_inode(mode_t mode)
{
struct inode * inode = new_inode(configfs_sb);
if (inode) {
inode->i_mode = mode;
inode->i_uid = 0;
inode->i_gid = 0;
inode->i_blksize = PAGE_CACHE_SIZE;
inode->i_blocks = 0;
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
inode->i_mapping->a_ops = &configfs_aops;
inode->i_mapping->backing_dev_info = &configfs_backing_dev_info;
}
return inode;
}
int configfs_create(struct dentry * dentry, int mode, int (*init)(struct inode *))
{
int error = 0;
struct inode * inode = NULL;
if (dentry) {
if (!dentry->d_inode) {
if ((inode = configfs_new_inode(mode))) {
if (dentry->d_parent && dentry->d_parent->d_inode) {
struct inode *p_inode = dentry->d_parent->d_inode;
p_inode->i_mtime = p_inode->i_ctime = CURRENT_TIME;
}
goto Proceed;
}
else
error = -ENOMEM;
} else
error = -EEXIST;
} else
error = -ENOENT;
goto Done;
Proceed:
if (init)
error = init(inode);
if (!error) {
d_instantiate(dentry, inode);
if (S_ISDIR(mode) || S_ISLNK(mode))
dget(dentry); /* pin link and directory dentries in core */
} else
iput(inode);
Done:
return error;
}
/*
* Get the name for corresponding element represented by the given configfs_dirent
*/
const unsigned char * configfs_get_name(struct configfs_dirent *sd)
{
struct attribute * attr;
if (!sd || !sd->s_element)
BUG();
/* These always have a dentry, so use that */
if (sd->s_type & (CONFIGFS_DIR | CONFIGFS_ITEM_LINK))
return sd->s_dentry->d_name.name;
if (sd->s_type & CONFIGFS_ITEM_ATTR) {
attr = sd->s_element;
return attr->name;
}
return NULL;
}
/*
* Unhashes the dentry corresponding to given configfs_dirent
* Called with parent inode's i_sem held.
*/
void configfs_drop_dentry(struct configfs_dirent * sd, struct dentry * parent)
{
struct dentry * dentry = sd->s_dentry;
if (dentry) {
spin_lock(&dcache_lock);
if (!(d_unhashed(dentry) && dentry->d_inode)) {
dget_locked(dentry);
__d_drop(dentry);
spin_unlock(&dcache_lock);
simple_unlink(parent->d_inode, dentry);
} else
spin_unlock(&dcache_lock);
}
}
void configfs_hash_and_remove(struct dentry * dir, const char * name)
{
struct configfs_dirent * sd;
struct configfs_dirent * parent_sd = dir->d_fsdata;
down(&dir->d_inode->i_sem);
list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
if (!sd->s_element)
continue;
if (!strcmp(configfs_get_name(sd), name)) {
list_del_init(&sd->s_sibling);
configfs_drop_dentry(sd, dir);
configfs_put(sd);
break;
}
}
up(&dir->d_inode->i_sem);
}
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* item.c - library routines for handling generic config items
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
* Based on kobject:
* kobject is Copyright (c) 2002-2003 Patrick Mochel
*
* configfs Copyright (C) 2005 Oracle. All rights reserved.
*
* Please see the file Documentation/filesystems/configfs.txt for
* critical information about using the config_item interface.
*/
#include <linux/string.h>
#include <linux/module.h>
#include <linux/stat.h>
#include <linux/slab.h>
#include <linux/configfs.h>
static inline struct config_item * to_item(struct list_head * entry)
{
return container_of(entry,struct config_item,ci_entry);
}
/* Evil kernel */
static void config_item_release(struct kref *kref);
/**
* config_item_init - initialize item.
* @item: item in question.
*/
void config_item_init(struct config_item * item)
{
kref_init(&item->ci_kref);
INIT_LIST_HEAD(&item->ci_entry);
}
/**
* config_item_set_name - Set the name of an item
* @item: item.
* @name: name.
*
* If strlen(name) >= CONFIGFS_ITEM_NAME_LEN, then use a
* dynamically allocated string that @item->ci_name points to.
* Otherwise, use the static @item->ci_namebuf array.
*/
int config_item_set_name(struct config_item * item, const char * fmt, ...)
{
int error = 0;
int limit = CONFIGFS_ITEM_NAME_LEN;
int need;
va_list args;
char * name;
/*
* First, try the static array
*/
va_start(args,fmt);
need = vsnprintf(item->ci_namebuf,limit,fmt,args);
va_end(args);
if (need < limit)
name = item->ci_namebuf;
else {
/*
* Need more space? Allocate it and try again
*/
limit = need + 1;
name = kmalloc(limit,GFP_KERNEL);
if (!name) {
error = -ENOMEM;
goto Done;
}
va_start(args,fmt);
need = vsnprintf(name,limit,fmt,args);
va_end(args);
/* Still? Give up. */
if (need >= limit) {
kfree(name);
error = -EFAULT;
goto Done;
}
}
/* Free the old name, if necessary. */
if (item->ci_name && item->ci_name != item->ci_namebuf)
kfree(item->ci_name);
/* Now, set the new name */
item->ci_name = name;
Done:
return error;
}
EXPORT_SYMBOL(config_item_set_name);
void config_item_init_type_name(struct config_item *item,
const char *name,
struct config_item_type *type)
{
config_item_set_name(item, name);
item->ci_type = type;
config_item_init(item);
}
EXPORT_SYMBOL(config_item_init_type_name);
void config_group_init_type_name(struct config_group *group, const char *name,
struct config_item_type *type)
{
config_item_set_name(&group->cg_item, name);
group->cg_item.ci_type = type;
config_group_init(group);
}
EXPORT_SYMBOL(config_group_init_type_name);
struct config_item * config_item_get(struct config_item * item)
{
if (item)
kref_get(&item->ci_kref);
return item;
}
/**
* config_item_cleanup - free config_item resources.
* @item: item.
*/
void config_item_cleanup(struct config_item * item)
{
struct config_item_type * t = item->ci_type;
struct config_group * s = item->ci_group;
struct config_item * parent = item->ci_parent;
pr_debug("config_item %s: cleaning up\n",config_item_name(item));
if (item->ci_name != item->ci_namebuf)
kfree(item->ci_name);
item->ci_name = NULL;
if (t && t->ct_item_ops && t->ct_item_ops->release)
t->ct_item_ops->release(item);
if (s)
config_group_put(s);
if (parent)
config_item_put(parent);
}
static void config_item_release(struct kref *kref)
{
config_item_cleanup(container_of(kref, struct config_item, ci_kref));
}
/**
* config_item_put - decrement refcount for item.
* @item: item.
*
* Decrement the refcount, and if 0, call config_item_cleanup().
*/
void config_item_put(struct config_item * item)
{
if (item)
kref_put(&item->ci_kref, config_item_release);
}
/**
* config_group_init - initialize a group for use
* @k: group
*/
void config_group_init(struct config_group *group)
{
config_item_init(&group->cg_item);
INIT_LIST_HEAD(&group->cg_children);
}
/**
* config_group_find_obj - search for item in group.
* @group: group we're looking in.
* @name: item's name.
*
* Lock group via @group->cg_subsys, and iterate over @group->cg_list,
* looking for a matching config_item. If matching item is found
* take a reference and return the item.
*/
struct config_item * config_group_find_obj(struct config_group * group, const char * name)
{
struct list_head * entry;
struct config_item * ret = NULL;
/* XXX LOCKING! */
list_for_each(entry,&group->cg_children) {
struct config_item * item = to_item(entry);
if (config_item_name(item) &&
!strcmp(config_item_name(item), name)) {
ret = config_item_get(item);
break;
}
}
return ret;
}
EXPORT_SYMBOL(config_item_init);
EXPORT_SYMBOL(config_group_init);
EXPORT_SYMBOL(config_item_get);
EXPORT_SYMBOL(config_item_put);
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* mount.c - operations for initializing and mounting configfs.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
* Based on sysfs:
* sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
*
* configfs Copyright (C) 2005 Oracle. All rights reserved.
*/
#include <linux/fs.h>
#include <linux/module.h>
#include <linux/mount.h>
#include <linux/pagemap.h>
#include <linux/init.h>
#include <linux/configfs.h>
#include "configfs_internal.h"
/* Random magic number */
#define CONFIGFS_MAGIC 0x62656570
struct vfsmount * configfs_mount = NULL;
struct super_block * configfs_sb = NULL;
static int configfs_mnt_count = 0;
static struct super_operations configfs_ops = {
.statfs = simple_statfs,
.drop_inode = generic_delete_inode,
};
static struct config_group configfs_root_group = {
.cg_item = {
.ci_namebuf = "root",
.ci_name = configfs_root_group.cg_item.ci_namebuf,
},
};
int configfs_is_root(struct config_item *item)
{
return item == &configfs_root_group.cg_item;
}
static struct configfs_dirent configfs_root = {
.s_sibling = LIST_HEAD_INIT(configfs_root.s_sibling),
.s_children = LIST_HEAD_INIT(configfs_root.s_children),
.s_element = &configfs_root_group.cg_item,
.s_type = CONFIGFS_ROOT,
};
static int configfs_fill_super(struct super_block *sb, void *data, int silent)
{
struct inode *inode;
struct dentry *root;
sb->s_blocksize = PAGE_CACHE_SIZE;
sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
sb->s_magic = CONFIGFS_MAGIC;
sb->s_op = &configfs_ops;
configfs_sb = sb;
inode = configfs_new_inode(S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
if (inode) {
inode->i_op = &configfs_dir_inode_operations;
inode->i_fop = &configfs_dir_operations;
/* directory inodes start off with i_nlink == 2 (for "." entry) */
inode->i_nlink++;
} else {
pr_debug("configfs: could not get root inode\n");
return -ENOMEM;
}
root = d_alloc_root(inode);
if (!root) {
pr_debug("%s: could not get root dentry!\n",__FUNCTION__);
iput(inode);
return -ENOMEM;
}
config_group_init(&configfs_root_group);
configfs_root_group.cg_item.ci_dentry = root;
root->d_fsdata = &configfs_root;
sb->s_root = root;
return 0;
}
static struct super_block *configfs_get_sb(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
{
return get_sb_single(fs_type, flags, data, configfs_fill_super);
}
static struct file_system_type configfs_fs_type = {
.owner = THIS_MODULE,
.name = "configfs",
.get_sb = configfs_get_sb,
.kill_sb = kill_litter_super,
};
int configfs_pin_fs(void)
{
return simple_pin_fs("configfs", &configfs_mount,
&configfs_mnt_count);
}
void configfs_release_fs(void)
{
simple_release_fs(&configfs_mount, &configfs_mnt_count);
}
static decl_subsys(config, NULL, NULL);
static int __init configfs_init(void)
{
int err;
kset_set_kset_s(&config_subsys, kernel_subsys);
err = subsystem_register(&config_subsys);
if (err)
return err;
err = register_filesystem(&configfs_fs_type);
if (err) {
printk(KERN_ERR "configfs: Unable to register filesystem!\n");
subsystem_unregister(&config_subsys);
}
return err;
}
static void __exit configfs_exit(void)
{
unregister_filesystem(&configfs_fs_type);
subsystem_unregister(&config_subsys);
}
MODULE_AUTHOR("Oracle");
MODULE_LICENSE("GPL");
MODULE_VERSION("0.0.1");
MODULE_DESCRIPTION("Simple RAM filesystem for user driven kernel subsystem configuration.");
module_init(configfs_init);
module_exit(configfs_exit);
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* symlink.c - operations for configfs symlinks.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
* Based on sysfs:
* sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
*
* configfs Copyright (C) 2005 Oracle. All rights reserved.
*/
#include <linux/fs.h>
#include <linux/module.h>
#include <linux/namei.h>
#include <linux/configfs.h>
#include "configfs_internal.h"
static int item_depth(struct config_item * item)
{
struct config_item * p = item;
int depth = 0;
do { depth++; } while ((p = p->ci_parent) && !configfs_is_root(p));
return depth;
}
static int item_path_length(struct config_item * item)
{
struct config_item * p = item;
int length = 1;
do {
length += strlen(config_item_name(p)) + 1;
p = p->ci_parent;
} while (p && !configfs_is_root(p));
return length;
}
static void fill_item_path(struct config_item * item, char * buffer, int length)
{
struct config_item * p;
--length;
for (p = item; p && !configfs_is_root(p); p = p->ci_parent) {
int cur = strlen(config_item_name(p));
/* back up enough to print this bus id with '/' */
length -= cur;
strncpy(buffer + length,config_item_name(p),cur);
*(buffer + --length) = '/';
}
}
static int create_link(struct config_item *parent_item,
struct config_item *item,
struct dentry *dentry)
{
struct configfs_dirent *target_sd = item->ci_dentry->d_fsdata;
struct configfs_symlink *sl;
int ret;
ret = -ENOMEM;
sl = kmalloc(sizeof(struct configfs_symlink), GFP_KERNEL);
if (sl) {
sl->sl_target = config_item_get(item);
/* FIXME: needs a lock, I'd bet */
list_add(&sl->sl_list, &target_sd->s_links);
ret = configfs_create_link(sl, parent_item->ci_dentry,
dentry);
if (ret) {
list_del_init(&sl->sl_list);
config_item_put(item);
kfree(sl);
}
}
return ret;
}
static int get_target(const char *symname, struct nameidata *nd,
struct config_item **target)
{
int ret;
ret = path_lookup(symname, LOOKUP_FOLLOW|LOOKUP_DIRECTORY, nd);
if (!ret) {
if (nd->dentry->d_sb == configfs_sb) {
*target = configfs_get_config_item(nd->dentry);
if (!*target) {
ret = -ENOENT;
path_release(nd);
}
} else
ret = -EPERM;
}
return ret;
}
int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symname)
{
int ret;
struct nameidata nd;
struct config_item *parent_item;
struct config_item *target_item;
struct config_item_type *type;
ret = -EPERM; /* What lack-of-symlink returns */
if (dentry->d_parent == configfs_sb->s_root)
goto out;
parent_item = configfs_get_config_item(dentry->d_parent);
type = parent_item->ci_type;
if (!type || !type->ct_item_ops ||
!type->ct_item_ops->allow_link)
goto out_put;
ret = get_target(symname, &nd, &target_item);
if (ret)
goto out_put;
ret = type->ct_item_ops->allow_link(parent_item, target_item);
if (!ret)
ret = create_link(parent_item, target_item, dentry);
config_item_put(target_item);
path_release(&nd);
out_put:
config_item_put(parent_item);
out:
return ret;
}
int configfs_unlink(struct inode *dir, struct dentry *dentry)
{
struct configfs_dirent *sd = dentry->d_fsdata;
struct configfs_symlink *sl;
struct config_item *parent_item;
struct config_item_type *type;
int ret;
ret = -EPERM; /* What lack-of-symlink returns */
if (!(sd->s_type & CONFIGFS_ITEM_LINK))
goto out;
if (dentry->d_parent == configfs_sb->s_root)
BUG();
sl = sd->s_element;
parent_item = configfs_get_config_item(dentry->d_parent);
type = parent_item->ci_type;
list_del_init(&sd->s_sibling);
configfs_drop_dentry(sd, dentry->d_parent);
dput(dentry);
configfs_put(sd);
/*
* drop_link() must be called before
* list_del_init(&sl->sl_list), so that the order of
* drop_link(this, target) and drop_item(target) is preserved.
*/
if (type && type->ct_item_ops &&
type->ct_item_ops->drop_link)
type->ct_item_ops->drop_link(parent_item,
sl->sl_target);
/* FIXME: Needs lock */
list_del_init(&sl->sl_list);
/* Put reference from create_link() */
config_item_put(sl->sl_target);
kfree(sl);
config_item_put(parent_item);
ret = 0;
out:
return ret;
}
static int configfs_get_target_path(struct config_item * item, struct config_item * target,
char *path)
{
char * s;
int depth, size;
depth = item_depth(item);
size = item_path_length(target) + depth * 3 - 1;
if (size > PATH_MAX)
return -ENAMETOOLONG;
pr_debug("%s: depth = %d, size = %d\n", __FUNCTION__, depth, size);
for (s = path; depth--; s += 3)
strcpy(s,"../");
fill_item_path(target, path, size);
pr_debug("%s: path = '%s'\n", __FUNCTION__, path);
return 0;
}
static int configfs_getlink(struct dentry *dentry, char * path)
{
struct config_item *item, *target_item;
int error = 0;
item = configfs_get_config_item(dentry->d_parent);
if (!item)
return -EINVAL;
target_item = configfs_get_config_item(dentry);
if (!target_item) {
config_item_put(item);
return -EINVAL;
}
down_read(&configfs_rename_sem);
error = configfs_get_target_path(item, target_item, path);
up_read(&configfs_rename_sem);
config_item_put(item);
config_item_put(target_item);
return error;
}
static void *configfs_follow_link(struct dentry *dentry, struct nameidata *nd)
{
int error = -ENOMEM;
unsigned long page = get_zeroed_page(GFP_KERNEL);
if (page) {
error = configfs_getlink(dentry, (char *)page);
if (!error) {
nd_set_link(nd, (char *)page);
return (void *)page;
}
}
nd_set_link(nd, ERR_PTR(error));
return NULL;
}
static void configfs_put_link(struct dentry *dentry, struct nameidata *nd,
void *cookie)
{
if (cookie) {
unsigned long page = (unsigned long)cookie;
free_page(page);
}
}
struct inode_operations configfs_symlink_inode_operations = {
.follow_link = configfs_follow_link,
.readlink = generic_readlink,
.put_link = configfs_put_link,
};
......@@ -721,7 +721,7 @@ mpage_writepages(struct address_space *mapping,
&last_block_in_bio, &ret, wbc,
page->mapping->a_ops->writepage);
}
if (unlikely(ret == WRITEPAGE_ACTIVATE))
if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE))
unlock_page(page);
if (ret || (--(wbc->nr_to_write) <= 0))
done = 1;
......
EXTRA_CFLAGS += -Ifs/ocfs2
EXTRA_CFLAGS += -DCATCH_BH_JBD_RACES
obj-$(CONFIG_OCFS2_FS) += ocfs2.o
ocfs2-objs := \
alloc.o \
aops.o \
buffer_head_io.o \
dcache.o \
dir.o \
dlmglue.o \
export.o \
extent_map.o \
file.o \
heartbeat.o \
inode.o \
journal.o \
localalloc.o \
mmap.o \
namei.o \
slot_map.o \
suballoc.o \
super.o \
symlink.o \
sysfile.o \
uptodate.o \
ver.o \
vote.o
obj-$(CONFIG_OCFS2_FS) += cluster/
obj-$(CONFIG_OCFS2_FS) += dlm/
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* alloc.h
*
* Function prototypes
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_ALLOC_H
#define OCFS2_ALLOC_H
struct ocfs2_alloc_context;
int ocfs2_insert_extent(struct ocfs2_super *osb,
struct ocfs2_journal_handle *handle,
struct inode *inode,
struct buffer_head *fe_bh,
u64 blkno,
u32 new_clusters,
struct ocfs2_alloc_context *meta_ac);
int ocfs2_num_free_extents(struct ocfs2_super *osb,
struct inode *inode,
struct ocfs2_dinode *fe);
/* how many new metadata chunks would an allocation need at maximum? */
static inline int ocfs2_extend_meta_needed(struct ocfs2_dinode *fe)
{
/*
* Rather than do all the work of determining how much we need
* (involves a ton of reads and locks), just ask for the
* maximal limit. That's a tree depth shift. So, one block for
* level of the tree (current l_tree_depth), one block for the
* new tree_depth==0 extent_block, and one block at the new
* top-of-the tree.
*/
return le16_to_cpu(fe->id2.i_list.l_tree_depth) + 2;
}
int ocfs2_truncate_log_init(struct ocfs2_super *osb);
void ocfs2_truncate_log_shutdown(struct ocfs2_super *osb);
void ocfs2_schedule_truncate_log_flush(struct ocfs2_super *osb,
int cancel);
int ocfs2_flush_truncate_log(struct ocfs2_super *osb);
int ocfs2_begin_truncate_log_recovery(struct ocfs2_super *osb,
int slot_num,
struct ocfs2_dinode **tl_copy);
int ocfs2_complete_truncate_log_recovery(struct ocfs2_super *osb,
struct ocfs2_dinode *tl_copy);
struct ocfs2_truncate_context {
struct inode *tc_ext_alloc_inode;
struct buffer_head *tc_ext_alloc_bh;
int tc_ext_alloc_locked; /* is it cluster locked? */
/* these get destroyed once it's passed to ocfs2_commit_truncate. */
struct buffer_head *tc_last_eb_bh;
};
int ocfs2_prepare_truncate(struct ocfs2_super *osb,
struct inode *inode,
struct buffer_head *fe_bh,
struct ocfs2_truncate_context **tc);
int ocfs2_commit_truncate(struct ocfs2_super *osb,
struct inode *inode,
struct buffer_head *fe_bh,
struct ocfs2_truncate_context *tc);
#endif /* OCFS2_ALLOC_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* Copyright (C) 2002, 2004, 2005 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_AOPS_H
#define OCFS2_AOPS_H
int ocfs2_prepare_write(struct file *file, struct page *page,
unsigned from, unsigned to);
struct ocfs2_journal_handle *ocfs2_start_walk_page_trans(struct inode *inode,
struct page *page,
unsigned from,
unsigned to);
/* all ocfs2_dio_end_io()'s fault */
#define ocfs2_iocb_is_rw_locked(iocb) \
test_bit(0, (unsigned long *)&iocb->private)
#define ocfs2_iocb_set_rw_locked(iocb) \
set_bit(0, (unsigned long *)&iocb->private)
#define ocfs2_iocb_clear_rw_locked(iocb) \
clear_bit(0, (unsigned long *)&iocb->private)
#endif /* OCFS2_FILE_H */
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* io.c
*
* Buffer cache handling
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#include <linux/fs.h>
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/highmem.h>
#include <cluster/masklog.h>
#include "ocfs2.h"
#include "alloc.h"
#include "inode.h"
#include "journal.h"
#include "uptodate.h"
#include "buffer_head_io.h"
int ocfs2_write_block(struct ocfs2_super *osb, struct buffer_head *bh,
struct inode *inode)
{
int ret = 0;
mlog_entry("(bh->b_blocknr = %llu, inode=%p)\n",
(unsigned long long)bh->b_blocknr, inode);
BUG_ON(bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO);
BUG_ON(buffer_jbd(bh));
/* No need to check for a soft readonly file system here. non
* journalled writes are only ever done on system files which
* can get modified during recovery even if read-only. */
if (ocfs2_is_hard_readonly(osb)) {
ret = -EROFS;
goto out;
}
down(&OCFS2_I(inode)->ip_io_sem);
lock_buffer(bh);
set_buffer_uptodate(bh);
/* remove from dirty list before I/O. */
clear_buffer_dirty(bh);
get_bh(bh); /* for end_buffer_write_sync() */
bh->b_end_io = end_buffer_write_sync;
submit_bh(WRITE, bh);
wait_on_buffer(bh);
if (buffer_uptodate(bh)) {
ocfs2_set_buffer_uptodate(inode, bh);
} else {
/* We don't need to remove the clustered uptodate
* information for this bh as it's not marked locally
* uptodate. */
ret = -EIO;
brelse(bh);
}
up(&OCFS2_I(inode)->ip_io_sem);
out:
mlog_exit(ret);
return ret;
}
int ocfs2_read_blocks(struct ocfs2_super *osb, u64 block, int nr,
struct buffer_head *bhs[], int flags,
struct inode *inode)
{
int status = 0;
struct super_block *sb;
int i, ignore_cache = 0;
struct buffer_head *bh;
mlog_entry("(block=(%"MLFu64"), nr=(%d), flags=%d, inode=%p)\n",
block, nr, flags, inode);
if (osb == NULL || osb->sb == NULL || bhs == NULL) {
status = -EINVAL;
mlog_errno(status);
goto bail;
}
if (nr < 0) {
mlog(ML_ERROR, "asked to read %d blocks!\n", nr);
status = -EINVAL;
mlog_errno(status);
goto bail;
}
if (nr == 0) {
mlog(ML_BH_IO, "No buffers will be read!\n");
status = 0;
goto bail;
}
sb = osb->sb;
if (flags & OCFS2_BH_CACHED && !inode)
flags &= ~OCFS2_BH_CACHED;
if (inode)
down(&OCFS2_I(inode)->ip_io_sem);
for (i = 0 ; i < nr ; i++) {
if (bhs[i] == NULL) {
bhs[i] = sb_getblk(sb, block++);
if (bhs[i] == NULL) {
if (inode)
up(&OCFS2_I(inode)->ip_io_sem);
status = -EIO;
mlog_errno(status);
goto bail;
}
}
bh = bhs[i];
ignore_cache = 0;
if (flags & OCFS2_BH_CACHED &&
!ocfs2_buffer_uptodate(inode, bh)) {
mlog(ML_UPTODATE,
"bh (%llu), inode %"MLFu64" not uptodate\n",
(unsigned long long)bh->b_blocknr,
OCFS2_I(inode)->ip_blkno);
ignore_cache = 1;
}
/* XXX: Can we ever get this and *not* have the cached
* flag set? */
if (buffer_jbd(bh)) {
if (!(flags & OCFS2_BH_CACHED) || ignore_cache)
mlog(ML_BH_IO, "trying to sync read a jbd "
"managed bh (blocknr = %llu)\n",
(unsigned long long)bh->b_blocknr);
continue;
}
if (!(flags & OCFS2_BH_CACHED) || ignore_cache) {
if (buffer_dirty(bh)) {
/* This should probably be a BUG, or
* at least return an error. */
mlog(ML_BH_IO, "asking me to sync read a dirty "
"buffer! (blocknr = %llu)\n",
(unsigned long long)bh->b_blocknr);
continue;
}
lock_buffer(bh);
if (buffer_jbd(bh)) {
#ifdef CATCH_BH_JBD_RACES
mlog(ML_ERROR, "block %llu had the JBD bit set "
"while I was in lock_buffer!",
(unsigned long long)bh->b_blocknr);
BUG();
#else
unlock_buffer(bh);
continue;
#endif
}
clear_buffer_uptodate(bh);
get_bh(bh); /* for end_buffer_read_sync() */
bh->b_end_io = end_buffer_read_sync;
if (flags & OCFS2_BH_READAHEAD)
submit_bh(READA, bh);
else
submit_bh(READ, bh);
continue;
}
}
status = 0;
for (i = (nr - 1); i >= 0; i--) {
bh = bhs[i];
/* We know this can't have changed as we hold the
* inode sem. Avoid doing any work on the bh if the
* journal has it. */
if (!buffer_jbd(bh))
wait_on_buffer(bh);
if (!buffer_uptodate(bh)) {
/* Status won't be cleared from here on out,
* so we can safely record this and loop back
* to cleanup the other buffers. Don't need to
* remove the clustered uptodate information
* for this bh as it's not marked locally
* uptodate. */
status = -EIO;
brelse(bh);
bhs[i] = NULL;
continue;
}
if (inode)
ocfs2_set_buffer_uptodate(inode, bh);
}
if (inode)
up(&OCFS2_I(inode)->ip_io_sem);
mlog(ML_BH_IO, "block=(%"MLFu64"), nr=(%d), cached=%s\n", block, nr,
(!(flags & OCFS2_BH_CACHED) || ignore_cache) ? "no" : "yes");
bail:
mlog_exit(status);
return status;
}
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* ocfs2_buffer_head.h
*
* Buffer cache handling functions defined
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_BUFFER_HEAD_IO_H
#define OCFS2_BUFFER_HEAD_IO_H
#include <linux/buffer_head.h>
void ocfs2_end_buffer_io_sync(struct buffer_head *bh,
int uptodate);
static inline int ocfs2_read_block(struct ocfs2_super *osb,
u64 off,
struct buffer_head **bh,
int flags,
struct inode *inode);
int ocfs2_write_block(struct ocfs2_super *osb,
struct buffer_head *bh,
struct inode *inode);
int ocfs2_read_blocks(struct ocfs2_super *osb,
u64 block,
int nr,
struct buffer_head *bhs[],
int flags,
struct inode *inode);
#define OCFS2_BH_CACHED 1
#define OCFS2_BH_READAHEAD 8 /* use this to pass READA down to submit_bh */
static inline int ocfs2_read_block(struct ocfs2_super * osb, u64 off,
struct buffer_head **bh, int flags,
struct inode *inode)
{
int status = 0;
if (bh == NULL) {
printk("ocfs2: bh == NULL\n");
status = -EINVAL;
goto bail;
}
status = ocfs2_read_blocks(osb, off, 1, bh,
flags, inode);
bail:
return status;
}
#endif /* OCFS2_BUFFER_HEAD_IO_H */
obj-$(CONFIG_OCFS2_FS) += ocfs2_nodemanager.o
ocfs2_nodemanager-objs := heartbeat.o masklog.o sys.o nodemanager.o \
quorum.o tcp.o ver.o
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* Copyright (C) 2005 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_CLUSTER_ENDIAN_H
#define OCFS2_CLUSTER_ENDIAN_H
static inline void be32_add_cpu(__be32 *var, u32 val)
{
*var = cpu_to_be32(be32_to_cpu(*var) + val);
}
#endif /* OCFS2_CLUSTER_ENDIAN_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* heartbeat.h
*
* Function prototypes
*
* Copyright (C) 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*
*/
#ifndef O2CLUSTER_HEARTBEAT_H
#define O2CLUSTER_HEARTBEAT_H
#include "ocfs2_heartbeat.h"
#define O2HB_REGION_TIMEOUT_MS 2000
/* number of changes to be seen as live */
#define O2HB_LIVE_THRESHOLD 2
/* number of equal samples to be seen as dead */
extern unsigned int o2hb_dead_threshold;
#define O2HB_DEFAULT_DEAD_THRESHOLD 7
/* Otherwise MAX_WRITE_TIMEOUT will be zero... */
#define O2HB_MIN_DEAD_THRESHOLD 2
#define O2HB_MAX_WRITE_TIMEOUT_MS (O2HB_REGION_TIMEOUT_MS * (o2hb_dead_threshold - 1))
#define O2HB_CB_MAGIC 0x51d1e4ec
/* callback stuff */
enum o2hb_callback_type {
O2HB_NODE_DOWN_CB = 0,
O2HB_NODE_UP_CB,
O2HB_NUM_CB
};
struct o2nm_node;
typedef void (o2hb_cb_func)(struct o2nm_node *, int, void *);
struct o2hb_callback_func {
u32 hc_magic;
struct list_head hc_item;
o2hb_cb_func *hc_func;
void *hc_data;
int hc_priority;
enum o2hb_callback_type hc_type;
};
struct config_group *o2hb_alloc_hb_set(void);
void o2hb_free_hb_set(struct config_group *group);
void o2hb_setup_callback(struct o2hb_callback_func *hc,
enum o2hb_callback_type type,
o2hb_cb_func *func,
void *data,
int priority);
int o2hb_register_callback(struct o2hb_callback_func *hc);
int o2hb_unregister_callback(struct o2hb_callback_func *hc);
void o2hb_fill_node_map(unsigned long *map,
unsigned bytes);
void o2hb_init(void);
int o2hb_check_node_heartbeating(u8 node_num);
int o2hb_check_node_heartbeating_from_callback(u8 node_num);
int o2hb_check_local_node_heartbeating(void);
void o2hb_stop_all_regions(void);
#endif /* O2CLUSTER_HEARTBEAT_H */
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
EXTRA_CFLAGS += -Ifs/ocfs2
obj-$(CONFIG_OCFS2_FS) += ocfs2_dlm.o ocfs2_dlmfs.o
ocfs2_dlm-objs := dlmdomain.o dlmdebug.o dlmthread.o dlmrecovery.o \
dlmmaster.o dlmast.o dlmconvert.o dlmlock.o dlmunlock.o dlmver.o
ocfs2_dlmfs-objs := userdlm.o dlmfs.o dlmfsver.o
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
#ifndef OCFS2_MMAP_H
#define OCFS2_MMAP_H
int ocfs2_mmap(struct file *file, struct vm_area_struct *vma);
#endif /* OCFS2_MMAP_H */
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment