Commit 2829a935 authored by Christoph Hellwig, committed by James Bottomley

[PATCH] remove LVM1 leftovers from the tree

Now that the device mapper has hit the tree, there is no more reason
to keep the uncompilable LVM1 code and its various hacks to other
files around. This patch removes them.
parent adf283f2
@@ -28,8 +28,6 @@ IO-mapping.txt
- how to access I/O mapped memory from within device drivers.
IRQ-affinity.txt
- how to select which CPU(s) handle which interrupt events on SMP.
LVM-HOWTO
- info on setting up logical volume management (virtual disks etc.)
README.DAC960
- info on Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux
README.moxa
...
@@ -114,9 +114,6 @@ Architectural changes
DevFS is now in the kernel. See Documentation/filesystems/devfs/* in
the kernel source tree for all the gory details.
The Logical Volume Manager (LVM) is now in the kernel. If you want to
use this, you'll need to install the necessary LVM toolset.
32-bit UID support is now in place. Have fun!
Linux documentation for functions is transitioning to inline
@@ -330,10 +327,6 @@ Xfsprogs
--------
o <ftp://oss.sgi.com/projects/xfs/download/cmd_tars/xfsprogs-2.1.0.src.tar.gz>
LVM toolset
-----------
o <http://www.sistina.com/lvm/>
Pcmcia-cs
---------
o <ftp://pcmcia-cs.sourceforge.net/pub/pcmcia-cs/pcmcia-cs-3.1.21.tar.gz>
...
Heinz Mauelshagen's LVM (Logical Volume Manager) howto. 02/10/1999
Abstract:
---------
The LVM adds a kind of virtual disks and virtual partitions functionality
to the Linux operating system.
It achieves this by adding an additional layer between the physical peripherals
and the i/o interface in the kernel.
This allows the concatenation of several disk partitions or total disks
(so-called physical volumes or PVs) or even multiple devices
to form a storage pool (so-called Volume Group or VG) with
allocation units called physical extents (called PE).
You can think of the volume group as a virtual disk.
Please see scenario below.
Some or all PEs of this VG then can be allocated to so-called Logical Volumes
or LVs in units called logical extents or LEs.
Each LE is mapped to a corresponding PE.
LEs and PEs are equal in size.
Logical volumes are a kind of virtual partitions.
The LVs can be used through device special files similar to the known
/dev/sd[a-z]* or /dev/hd[a-z]* named /dev/VolumeGroupName/LogicalVolumeName.
But going beyond this, you are able to extend or reduce
VGs _AND_ LVs at runtime!
So...
If for example the capacity of a LV gets too small and your VG containing
this LV is full, you could add another PV to that VG and simply extend
the LV afterwards.
If you reduce or delete a LV you can use the freed capacity for different
LVs in the same VG.
The above scenario looks like this:
/------------------------------------------\
|  /--PV2---\      VG 1      /--PVn---\    |
|  |-VGDA---|                |-VGDA-- |    |
|  |PE1PE2..|                |PE1PE2..|    |
|  |        |     ......     |        |    |
|  |        |                |        |    |
|  |    /-----------------------\     |    |
|  |    \-------LV 1------------/     |    |
|  |   ..PEn|                |   ..PEn|    |
|  \--------/                \--------/    |
\------------------------------------------/
PV 1 could be /dev/sdc1 sized 3GB
PV n could be /dev/sde1 sized 4GB
VG 1 could be test_vg
LV 1 could be /dev/test_vg/test_lv
VGDA is the volume group descriptor area holding the LVM metadata
PE1 up to PEn is the number of physical extents on each disk(partition)
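The LE-to-PE mapping described in the abstract can be pictured as a plain table lookup. The following is a hypothetical userspace sketch, not the LVM1 driver's data structures: struct pe_ref, struct lv_map and lv_map_sector() are invented names, and sizes are assumed to be counted in sectors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the LVM1 driver's structures. */
struct pe_ref {
    unsigned pv_index;       /* which physical volume holds this extent */
    unsigned long pe_start;  /* first sector of the extent on that PV */
};

struct lv_map {
    unsigned long extent_size;      /* sectors per extent; LEs and PEs are equal in size */
    size_t le_count;                /* number of logical extents in the LV */
    const struct pe_ref *le_to_pe;  /* exactly one PE for each LE */
};

/*
 * Translate a sector offset inside a logical volume into a (PV, sector) pair.
 * Returns 0 on success, -1 if the sector lies beyond the end of the LV.
 */
int lv_map_sector(const struct lv_map *lv, unsigned long lv_sector,
                  unsigned *pv_index, unsigned long *pv_sector)
{
    unsigned long le, off;

    assert(lv->extent_size > 0);
    le = lv_sector / lv->extent_size;   /* which logical extent */
    off = lv_sector % lv->extent_size;  /* offset inside that extent */

    if (le >= lv->le_count)
        return -1;
    *pv_index = lv->le_to_pe[le].pv_index;
    *pv_sector = lv->le_to_pe[le].pe_start + off;
    return 0;
}
```

Because every LE has the same size as its PE, extending an LV in this model only means appending entries to the table, which is why an LV can span several PVs.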
For installation steps, see INSTALL. If you did not build the logical
volume manager into the kernel, load its module with insmod(1)/modprobe(1)
or let kmod/kerneld(8) load it.
Configuration steps for getting the above scenario:
1. Set the partition system id to 0x8e on /dev/sdc1 and /dev/sde1.
2. do a "pvcreate /dev/sd[ce]1"
For testing purposes you can use more than one partition on a disk.
Outside of testing you should not, because a striped LV whose stripes
share a disk will suffer a performance breakdown.
3. do a "vgcreate test_vg /dev/sd[ce]1" to create the new VG named "test_vg"
which has the total capacity of both partitions.
vgcreate activates (transfers the metadata into the LVM driver in the kernel)
the new volume group too to be able to create LVs in the next step.
4. do a "lvcreate -L1500 -ntest_lv test_vg" to get a 1500MB linear LV named
"test_lv" and its block device special "/dev/test_vg/test_lv".
Or do a "lvcreate -i2 -I4 -l100 -nanother_test_lv test_vg" to get a 100 LE
large logical volume with 2 stripes and stripesize 4 KB.
5. For example generate a filesystem in one LV with
"mke2fs /dev/test_vg/test_lv" and mount it.
6. extend /dev/test_vg/test_lv to 1600MB with relative size by
"lvextend -L+100 /dev/test_vg/test_lv"
or with absolute size by
"lvextend -L1600 /dev/test_vg/test_lv"
7. reduce /dev/test_vg/test_lv to 900 logical extents with relative extents by
"lvreduce -l-700 /dev/test_vg/test_lv"
or with absolute extents by
"lvreduce -l900 /dev/test_vg/test_lv"
8. rename a VG by deactivating it with
"vgchange -an test_vg" # only VGs with _no_ open LVs can be deactivated!
"vgrename test_vg whatever"
and reactivate it again by
"vgchange -ay whatever"
9. rename a LV after closing it by
"lvchange -an /dev/whatever/test_lv" # only closed LVs can be deactivated
"lvrename /dev/whatever/test_lv /dev/whatever/whatvolume"
or by
"lvrename whatever test_lv whatvolume"
and reactivate it again by
"lvchange -ay /dev/whatever/whatvolume"
10. if you own Ted Tso's/Powerquest's resize2fs program, you are able to
resize the ext2 type filesystems contained in logical volumes without
destroying the data by
"e2fsadm -L+100 /dev/test_vg/another_test_lv"
@@ -188,5 +188,3 @@ Code Seq# Include File Comments
0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca>
0xCB 00-1F CBM serial IEC bus in development:
<mailto:michael.klein@puffin.lb.shuttle.de>
0xFE 00-9F Logical Volume Manager <mailto:linux-lvm@sistina.com>
@@ -990,12 +990,6 @@ L: ldm-devel@lists.sourceforge.net
W: http://ldm.sourceforge.net
S: Maintained
LOGICAL VOLUME MANAGER
P: Heinz Mauelshagen
L: linux-LVM@sistina.com
W: http://www.sistina.com/lvm
S: Maintained
LSILOGIC/SYMBIOS/NCR 53C8XX and 53C1010 PCI-SCSI drivers
P: Gerard Roudier
M: groudier@free.fr
...
CONFIG_BLK_DEV_LVM
This driver lets you combine several hard disks, hard disk
partitions, multiple devices or even loop devices (for evaluation
purposes) into a volume group. Imagine a volume group as a kind of
virtual disk. Logical volumes, which can be thought of as virtual
partitions, can be created in the volume group. You can resize
volume groups and logical volumes after creation time, corresponding
to new capacity needs. Logical volumes are accessed as block
devices named /dev/VolumeGroupName/LogicalVolumeName.
For details see <file:Documentation/LVM-HOWTO>. You will need
supporting user space software; location is in
<file:Documentation/Changes>.
If you want to compile this support as a module ( = code which can
be inserted in and removed from the running kernel whenever you
want), say M here and read <file:Documentation/modules.txt>. The
module will be called lvm-mod.o.
CONFIG_MD
Support multiple physical spindles through a single logical device.
Required for RAID and logical volume management (LVM). Required for RAID and logical volume management.
CONFIG_BLK_DEV_MD
This driver lets you combine several hard disk partitions into one
...
@@ -12,8 +12,6 @@ dep_tristate ' RAID-0 (striping) mode' CONFIG_MD_RAID0 $CONFIG_BLK_DEV_MD
dep_tristate ' RAID-1 (mirroring) mode' CONFIG_MD_RAID1 $CONFIG_BLK_DEV_MD
dep_tristate ' RAID-4/RAID-5 mode' CONFIG_MD_RAID5 $CONFIG_BLK_DEV_MD
dep_tristate ' Multipath I/O support' CONFIG_MD_MULTIPATH $CONFIG_BLK_DEV_MD
dep_tristate ' Logical volume manager (LVM) support' CONFIG_BLK_DEV_LVM $CONFIG_MD
dep_tristate ' Device mapper support' CONFIG_BLK_DEV_DM $CONFIG_MD
endmenu
@@ -3,7 +3,6 @@
#
export-objs := md.o xor.o dm-table.o dm-target.o
lvm-mod-objs := lvm.o lvm-snap.o lvm-fs.o
dm-mod-objs := dm.o dm-table.o dm-target.o dm-linear.o dm-stripe.o \
dm-ioctl.o
@@ -18,7 +17,6 @@ obj-$(CONFIG_MD_RAID1) += raid1.o
obj-$(CONFIG_MD_RAID5) += raid5.o xor.o
obj-$(CONFIG_MD_MULTIPATH) += multipath.o
obj-$(CONFIG_BLK_DEV_MD) += md.o
obj-$(CONFIG_BLK_DEV_LVM) += lvm-mod.o
obj-$(CONFIG_BLK_DEV_DM) += dm-mod.o
include $(TOPDIR)/Rules.make
...
/*
* kernel/lvm-fs.c
*
* Copyright (C) 2001 Sistina Software
*
* January,February 2001
*
* LVM driver is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2, or (at your option)
* any later version.
*
* LVM driver is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with GNU CC; see the file COPYING. If not, write to
* the Free Software Foundation, 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*
*/
/*
* Changelog
*
* 11/01/2001 - First version (Joe Thornber)
* 21/03/2001 - added display of stripes and stripe size (HM)
* 04/10/2001 - corrected devfs_register() call in lvm_init_fs()
* 11/04/2001 - don't devfs_register("lvm") as user-space always does it
* 10/05/2001 - show more of PV name in /proc/lvm/global
* 16/12/2001 - fix devfs unregister order and prevent duplicate unreg (REG)
*
*/
#include <linux/config.h>
#include <linux/version.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/vmalloc.h>
#include <linux/smp_lock.h>
#include <linux/devfs_fs_kernel.h>
#include <linux/proc_fs.h>
#include <linux/init.h>
#include <linux/lvm.h>
#include "lvm-internal.h"
static int _proc_read_vg(char *page, char **start, off_t off,
int count, int *eof, void *data);
static int _proc_read_lv(char *page, char **start, off_t off,
int count, int *eof, void *data);
static int _proc_read_pv(char *page, char **start, off_t off,
int count, int *eof, void *data);
static int _proc_read_global(char *page, char **start, off_t off,
int count, int *eof, void *data);
static int _vg_info(vg_t *vg_ptr, char *buf);
static int _lv_info(vg_t *vg_ptr, lv_t *lv_ptr, char *buf);
static int _pv_info(pv_t *pv_ptr, char *buf);
static void _show_uuid(const char *src, char *b, char *e);
static devfs_handle_t lvm_devfs_handle;
static devfs_handle_t vg_devfs_handle[MAX_VG];
static devfs_handle_t ch_devfs_handle[MAX_VG];
static devfs_handle_t lv_devfs_handle[MAX_LV];
static struct proc_dir_entry *lvm_proc_dir = NULL;
static struct proc_dir_entry *lvm_proc_vg_subdir = NULL;
/* inline functions */
/* public interface */
void __init lvm_init_fs() {
struct proc_dir_entry *pde;
/* Must create device node. Think about "devfs=only" situation */
lvm_devfs_handle = devfs_register(
0 , "lvm", 0, LVM_CHAR_MAJOR, 0,
S_IFCHR | S_IRUSR | S_IWUSR | S_IRGRP,
&lvm_chr_fops, NULL);
lvm_proc_dir = create_proc_entry(LVM_DIR, S_IFDIR, &proc_root);
if (lvm_proc_dir) {
lvm_proc_vg_subdir = create_proc_entry(LVM_VG_SUBDIR, S_IFDIR,
lvm_proc_dir);
pde = create_proc_entry(LVM_GLOBAL, S_IFREG, lvm_proc_dir);
if ( pde != NULL) pde->read_proc = _proc_read_global;
}
}
void lvm_fin_fs() {
devfs_unregister (lvm_devfs_handle);
remove_proc_entry(LVM_GLOBAL, lvm_proc_dir);
remove_proc_entry(LVM_VG_SUBDIR, lvm_proc_dir);
remove_proc_entry(LVM_DIR, &proc_root);
}
void lvm_fs_create_vg(vg_t *vg_ptr) {
struct proc_dir_entry *pde;
vg_devfs_handle[vg_ptr->vg_number] =
devfs_mk_dir(0, vg_ptr->vg_name, NULL);
ch_devfs_handle[vg_ptr->vg_number] = devfs_register(
vg_devfs_handle[vg_ptr->vg_number] , "group",
DEVFS_FL_DEFAULT, LVM_CHAR_MAJOR, vg_ptr->vg_number,
S_IFCHR | S_IRUSR | S_IWUSR | S_IRGRP,
&lvm_chr_fops, NULL);
vg_ptr->vg_dir_pde = create_proc_entry(vg_ptr->vg_name, S_IFDIR,
lvm_proc_vg_subdir);
if((pde = create_proc_entry("group", S_IFREG, vg_ptr->vg_dir_pde))) {
pde->read_proc = _proc_read_vg;
pde->data = vg_ptr;
}
vg_ptr->lv_subdir_pde =
create_proc_entry(LVM_LV_SUBDIR, S_IFDIR, vg_ptr->vg_dir_pde);
vg_ptr->pv_subdir_pde =
create_proc_entry(LVM_PV_SUBDIR, S_IFDIR, vg_ptr->vg_dir_pde);
}
void lvm_fs_remove_vg(vg_t *vg_ptr) {
int i;
devfs_unregister(ch_devfs_handle[vg_ptr->vg_number]);
ch_devfs_handle[vg_ptr->vg_number] = NULL;
/* remove lv's */
for(i = 0; i < vg_ptr->lv_max; i++)
if(vg_ptr->lv[i]) lvm_fs_remove_lv(vg_ptr, vg_ptr->lv[i]);
/* remove pv's */
for(i = 0; i < vg_ptr->pv_max; i++)
if(vg_ptr->pv[i]) lvm_fs_remove_pv(vg_ptr, vg_ptr->pv[i]);
/* must not remove directory before leaf nodes */
devfs_unregister(vg_devfs_handle[vg_ptr->vg_number]);
vg_devfs_handle[vg_ptr->vg_number] = NULL;
if(vg_ptr->vg_dir_pde) {
remove_proc_entry(LVM_LV_SUBDIR, vg_ptr->vg_dir_pde);
vg_ptr->lv_subdir_pde = NULL;
remove_proc_entry(LVM_PV_SUBDIR, vg_ptr->vg_dir_pde);
vg_ptr->pv_subdir_pde = NULL;
remove_proc_entry("group", vg_ptr->vg_dir_pde);
vg_ptr->vg_dir_pde = NULL;
remove_proc_entry(vg_ptr->vg_name, lvm_proc_vg_subdir);
}
}
static inline const char *_basename(const char *str) {
const char *name = strrchr(str, '/');
name = name ? name + 1 : str;
return name;
}
devfs_handle_t lvm_fs_create_lv(vg_t *vg_ptr, lv_t *lv) {
struct proc_dir_entry *pde;
const char *name = _basename(lv->u.lv_name);
lv_devfs_handle[minor(lv->u.lv_dev)] = devfs_register(
vg_devfs_handle[vg_ptr->vg_number], name,
DEVFS_FL_DEFAULT, LVM_BLK_MAJOR, minor(lv->u.lv_dev),
S_IFBLK | S_IRUSR | S_IWUSR | S_IRGRP,
&lvm_blk_dops, NULL);
if(vg_ptr->lv_subdir_pde &&
(pde = create_proc_entry(name, S_IFREG, vg_ptr->lv_subdir_pde))) {
pde->read_proc = _proc_read_lv;
pde->data = lv;
}
return lv_devfs_handle[minor(lv->u.lv_dev)];
}
void lvm_fs_remove_lv(vg_t *vg_ptr, lv_t *lv) {
devfs_unregister(lv_devfs_handle[minor(lv->u.lv_dev)]);
lv_devfs_handle[minor(lv->u.lv_dev)] = NULL;
if(vg_ptr->lv_subdir_pde) {
const char *name = _basename(lv->u.lv_name);
remove_proc_entry(name, vg_ptr->lv_subdir_pde);
}
}
static inline void _make_pv_name(const char *src, char *b, char *e) {
int offset = strlen(LVM_DIR_PREFIX);
if(strncmp(src, LVM_DIR_PREFIX, offset))
offset = 0;
e--;
src += offset;
while(*src && (b != e)) {
*b++ = (*src == '/') ? '_' : *src;
src++;
}
*b = '\0';
}
void lvm_fs_create_pv(vg_t *vg_ptr, pv_t *pv) {
struct proc_dir_entry *pde;
char name[NAME_LEN];
if(!vg_ptr->pv_subdir_pde)
return;
_make_pv_name(pv->pv_name, name, name + sizeof(name));
if((pde = create_proc_entry(name, S_IFREG, vg_ptr->pv_subdir_pde))) {
pde->read_proc = _proc_read_pv;
pde->data = pv;
}
}
void lvm_fs_remove_pv(vg_t *vg_ptr, pv_t *pv) {
char name[NAME_LEN];
if(!vg_ptr->pv_subdir_pde)
return;
_make_pv_name(pv->pv_name, name, name + sizeof(name));
remove_proc_entry(name, vg_ptr->pv_subdir_pde);
}
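The name handling shared by lvm_fs_create_pv() and lvm_fs_remove_pv() can be tried outside the kernel. This is a userspace sketch of the same idea; DIR_PREFIX is an assumed stand-in for LVM_DIR_PREFIX (its real value is not shown in this diff), and make_pv_name() is a renamed copy for illustration.

```c
#include <assert.h>
#include <string.h>

#define DIR_PREFIX "/dev/"  /* assumed value; stands in for LVM_DIR_PREFIX */

/*
 * Copy src into the buffer [b, e), dropping a leading DIR_PREFIX and
 * replacing every remaining '/' with '_' so the result is usable as a
 * single /proc entry name. The output is always NUL-terminated.
 */
void make_pv_name(const char *src, char *b, char *e)
{
    size_t offset = strlen(DIR_PREFIX);

    assert(b < e);
    if (strncmp(src, DIR_PREFIX, offset) != 0)
        offset = 0;   /* prefix absent: keep the whole string */
    e--;              /* reserve one byte for the terminator */
    src += offset;
    while (*src && b != e) {
        *b++ = (*src == '/') ? '_' : *src;
        src++;
    }
    *b = '\0';
}
```

Passing the end of the buffer rather than its length lets the caller write name + sizeof(name), as the two functions above do.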
static int _proc_read_vg(char *page, char **start, off_t off,
int count, int *eof, void *data) {
int sz = 0;
vg_t *vg_ptr = data;
char uuid[NAME_LEN];
sz += sprintf(page + sz, "name: %s\n", vg_ptr->vg_name);
sz += sprintf(page + sz, "size: %u\n",
vg_ptr->pe_total * vg_ptr->pe_size / 2);
sz += sprintf(page + sz, "access: %u\n", vg_ptr->vg_access);
sz += sprintf(page + sz, "status: %u\n", vg_ptr->vg_status);
sz += sprintf(page + sz, "number: %u\n", vg_ptr->vg_number);
sz += sprintf(page + sz, "LV max: %u\n", vg_ptr->lv_max);
sz += sprintf(page + sz, "LV current: %u\n", vg_ptr->lv_cur);
sz += sprintf(page + sz, "LV open: %u\n", vg_ptr->lv_open);
sz += sprintf(page + sz, "PV max: %u\n", vg_ptr->pv_max);
sz += sprintf(page + sz, "PV current: %u\n", vg_ptr->pv_cur);
sz += sprintf(page + sz, "PV active: %u\n", vg_ptr->pv_act);
sz += sprintf(page + sz, "PE size: %u\n", vg_ptr->pe_size / 2);
sz += sprintf(page + sz, "PE total: %u\n", vg_ptr->pe_total);
sz += sprintf(page + sz, "PE allocated: %u\n", vg_ptr->pe_allocated);
_show_uuid(vg_ptr->vg_uuid, uuid, uuid + sizeof(uuid));
sz += sprintf(page + sz, "uuid: %s\n", uuid);
return sz;
}
static int _proc_read_lv(char *page, char **start, off_t off,
int count, int *eof, void *data) {
int sz = 0;
lv_t *lv = data;
sz += sprintf(page + sz, "name: %s\n", lv->u.lv_name);
sz += sprintf(page + sz, "size: %u\n", lv->u.lv_size);
sz += sprintf(page + sz, "access: %u\n", lv->u.lv_access);
sz += sprintf(page + sz, "status: %u\n", lv->u.lv_status);
sz += sprintf(page + sz, "number: %u\n", lv->u.lv_number);
sz += sprintf(page + sz, "open: %u\n", lv->u.lv_open);
sz += sprintf(page + sz, "allocation: %u\n", lv->u.lv_allocation);
if(lv->u.lv_stripes > 1) {
sz += sprintf(page + sz, "stripes: %u\n",
lv->u.lv_stripes);
sz += sprintf(page + sz, "stripesize: %u\n",
lv->u.lv_stripesize);
}
sz += sprintf(page + sz, "device: %02u:%02u\n",
major(lv->u.lv_dev), minor(lv->u.lv_dev));
return sz;
}
static int _proc_read_pv(char *page, char **start, off_t off,
int count, int *eof, void *data) {
int sz = 0;
pv_t *pv = data;
char uuid[NAME_LEN];
sz += sprintf(page + sz, "name: %s\n", pv->pv_name);
sz += sprintf(page + sz, "size: %u\n", pv->pv_size);
sz += sprintf(page + sz, "status: %u\n", pv->pv_status);
sz += sprintf(page + sz, "number: %u\n", pv->pv_number);
sz += sprintf(page + sz, "allocatable: %u\n", pv->pv_allocatable);
sz += sprintf(page + sz, "LV current: %u\n", pv->lv_cur);
sz += sprintf(page + sz, "PE size: %u\n", pv->pe_size / 2);
sz += sprintf(page + sz, "PE total: %u\n", pv->pe_total);
sz += sprintf(page + sz, "PE allocated: %u\n", pv->pe_allocated);
sz += sprintf(page + sz, "device: %02u:%02u\n",
major(pv->pv_dev), minor(pv->pv_dev));
_show_uuid(pv->pv_uuid, uuid, uuid + sizeof(uuid));
sz += sprintf(page + sz, "uuid: %s\n", uuid);
return sz;
}
static int _proc_read_global(char *page, char **start, off_t pos, int count,
int *eof, void *data) {
#define LVM_PROC_BUF ( i == 0 ? dummy_buf : &buf[sz])
int c, i, l, p, v, vg_counter, pv_counter, lv_counter, lv_open_counter,
lv_open_total, pe_t_bytes, hash_table_bytes, lv_block_exception_t_bytes, seconds;
static off_t sz;
off_t sz_last;
static char *buf = NULL;
static char dummy_buf[160]; /* sized for 2 lines */
vg_t *vg_ptr;
lv_t *lv_ptr;
pv_t *pv_ptr;
#ifdef DEBUG_LVM_PROC_GET_INFO
printk(KERN_DEBUG
"%s - lvm_proc_get_global_info CALLED pos: %lu count: %d\n",
lvm_name, pos, count);
#endif
if(pos != 0 && buf != NULL)
goto out;
sz_last = vg_counter = pv_counter = lv_counter = lv_open_counter = \
lv_open_total = pe_t_bytes = hash_table_bytes = \
lv_block_exception_t_bytes = 0;
/* get some statistics */
for (v = 0; v < ABS_MAX_VG; v++) {
if ((vg_ptr = vg[v]) != NULL) {
vg_counter++;
pv_counter += vg_ptr->pv_cur;
lv_counter += vg_ptr->lv_cur;
if (vg_ptr->lv_cur > 0) {
for (l = 0; l < vg[v]->lv_max; l++) {
if ((lv_ptr = vg_ptr->lv[l]) != NULL) {
pe_t_bytes += lv_ptr->u.lv_allocated_le;
hash_table_bytes += lv_ptr->lv_snapshot_hash_table_size;
if (lv_ptr->u.lv_block_exception != NULL)
lv_block_exception_t_bytes += lv_ptr->u.lv_remap_end;
if (lv_ptr->u.lv_open > 0) {
lv_open_counter++;
lv_open_total += lv_ptr->u.lv_open;
}
}
}
}
}
}
pe_t_bytes *= sizeof(pe_t);
lv_block_exception_t_bytes *= sizeof(lv_block_exception_t);
if (buf != NULL) {
P_KFREE("%s -- vfree %d\n", lvm_name, __LINE__);
lock_kernel();
vfree(buf);
unlock_kernel();
buf = NULL;
}
/* 2 times: first to get size to allocate buffer,
2nd to fill the malloced buffer */
for (i = 0; i < 2; i++) {
sz = 0;
sz += sprintf(LVM_PROC_BUF,
"LVM "
#ifdef MODULE
"module"
#else
"driver"
#endif
" %s\n\n"
"Total: %d VG%s %d PV%s %d LV%s ",
lvm_version,
vg_counter, vg_counter == 1 ? "" : "s",
pv_counter, pv_counter == 1 ? "" : "s",
lv_counter, lv_counter == 1 ? "" : "s");
sz += sprintf(LVM_PROC_BUF,
"(%d LV%s open",
lv_open_counter,
lv_open_counter == 1 ? "" : "s");
if (lv_open_total > 0)
sz += sprintf(LVM_PROC_BUF,
" %d times)\n",
lv_open_total);
else
sz += sprintf(LVM_PROC_BUF, ")");
sz += sprintf(LVM_PROC_BUF,
"\nGlobal: %lu bytes malloced IOP version: %d ",
vg_counter * sizeof(vg_t) +
pv_counter * sizeof(pv_t) +
lv_counter * sizeof(lv_t) +
pe_t_bytes + hash_table_bytes + lv_block_exception_t_bytes + sz_last,
lvm_iop_version);
seconds = CURRENT_TIME - loadtime;
if (seconds < 0)
loadtime = CURRENT_TIME + seconds;
if (seconds / 86400 > 0) {
sz += sprintf(LVM_PROC_BUF, "%d day%s ",
seconds / 86400,
seconds / 86400 == 0 ||
seconds / 86400 > 1 ? "s" : "");
}
sz += sprintf(LVM_PROC_BUF, "%d:%02d:%02d active\n",
(seconds % 86400) / 3600,
(seconds % 3600) / 60,
seconds % 60);
if (vg_counter > 0) {
for (v = 0; v < ABS_MAX_VG; v++) {
/* volume group */
if ((vg_ptr = vg[v]) != NULL) {
sz += _vg_info(vg_ptr, LVM_PROC_BUF);
/* physical volumes */
sz += sprintf(LVM_PROC_BUF,
"\n PV%s ",
vg_ptr->pv_cur == 1 ? ": " : "s:");
c = 0;
for (p = 0; p < vg_ptr->pv_max; p++) {
if ((pv_ptr = vg_ptr->pv[p]) != NULL) {
sz += _pv_info(pv_ptr, LVM_PROC_BUF);
c++;
if (c < vg_ptr->pv_cur)
sz += sprintf(LVM_PROC_BUF,
"\n ");
}
}
/* logical volumes */
sz += sprintf(LVM_PROC_BUF,
"\n LV%s ",
vg_ptr->lv_cur == 1 ? ": " : "s:");
c = 0;
for (l = 0; l < vg_ptr->lv_max; l++) {
if ((lv_ptr = vg_ptr->lv[l]) != NULL) {
sz += _lv_info(vg_ptr, lv_ptr, LVM_PROC_BUF);
c++;
if (c < vg_ptr->lv_cur)
sz += sprintf(LVM_PROC_BUF,
"\n ");
}
}
if (vg_ptr->lv_cur == 0) sz += sprintf(LVM_PROC_BUF, "none");
sz += sprintf(LVM_PROC_BUF, "\n");
}
}
}
if (buf == NULL) {
lock_kernel();
buf = vmalloc(sz);
unlock_kernel();
if (buf == NULL) {
sz = 0;
return sprintf(page, "%s - vmalloc error at line %d\n",
lvm_name, __LINE__);
}
}
sz_last = sz;
}
out:
if (pos > sz - 1) {
lock_kernel();
vfree(buf);
unlock_kernel();
buf = NULL;
return 0;
}
*start = &buf[pos];
if (sz - pos < count)
return sz - pos;
else
return count;
#undef LVM_PROC_BUF
}
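_proc_read_global() formats its report twice: once to learn the size, once into the allocated buffer. A userspace sketch of that two-pass idea follows. It swaps in a different measuring technique (snprintf with a NULL buffer instead of a scratch buffer) and uses malloc instead of vmalloc; format_summary(), SUMMARY_FMT and PLURAL are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SUMMARY_FMT "Total: %d VG%s %d PV%s %d LV%s"
#define PLURAL(n) ((n) == 1 ? "" : "s")

/*
 * Format a one-line summary into a freshly allocated buffer.
 * Pass 1 measures the output length, pass 2 fills the buffer.
 * Returns NULL on failure; the caller frees the result.
 */
char *format_summary(int vgs, int pvs, int lvs)
{
    char *buf;
    int need;

    /* Pass 1: a NULL buffer with size 0 only reports the required length. */
    need = snprintf(NULL, 0, SUMMARY_FMT,
                    vgs, PLURAL(vgs), pvs, PLURAL(pvs), lvs, PLURAL(lvs));
    if (need < 0)
        return NULL;

    buf = malloc((size_t)need + 1);
    if (buf == NULL)
        return NULL;

    /* Pass 2: same arguments, now written into the real buffer. */
    snprintf(buf, (size_t)need + 1, SUMMARY_FMT,
             vgs, PLURAL(vgs), pvs, PLURAL(pvs), lvs, PLURAL(lvs));
    assert(strlen(buf) == (size_t)need);
    return buf;
}
```

Both passes must see identical data, otherwise the second pass can produce more output than the first one measured.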
/*
* provide VG info for proc filesystem use (global)
*/
static int _vg_info(vg_t *vg_ptr, char *buf) {
int sz = 0;
char inactive_flag = ' ';
if (!(vg_ptr->vg_status & VG_ACTIVE)) inactive_flag = 'I';
sz = sprintf(buf,
"\nVG: %c%s [%d PV, %d LV/%d open] "
" PE Size: %d KB\n"
" Usage [KB/PE]: %d /%d total "
"%d /%d used %d /%d free",
inactive_flag,
vg_ptr->vg_name,
vg_ptr->pv_cur,
vg_ptr->lv_cur,
vg_ptr->lv_open,
vg_ptr->pe_size >> 1,
vg_ptr->pe_size * vg_ptr->pe_total >> 1,
vg_ptr->pe_total,
vg_ptr->pe_allocated * vg_ptr->pe_size >> 1,
vg_ptr->pe_allocated,
(vg_ptr->pe_total - vg_ptr->pe_allocated) *
vg_ptr->pe_size >> 1,
vg_ptr->pe_total - vg_ptr->pe_allocated);
return sz;
}
/*
* provide LV info for proc filesystem use (global)
*/
static int _lv_info(vg_t *vg_ptr, lv_t *lv_ptr, char *buf) {
int sz = 0;
char inactive_flag = 'A', allocation_flag = ' ',
stripes_flag = ' ', rw_flag = ' ', *basename;
if (!(lv_ptr->u.lv_status & LV_ACTIVE))
inactive_flag = 'I';
rw_flag = 'R';
if (lv_ptr->u.lv_access & LV_WRITE)
rw_flag = 'W';
allocation_flag = 'D';
if (lv_ptr->u.lv_allocation & LV_CONTIGUOUS)
allocation_flag = 'C';
stripes_flag = 'L';
if (lv_ptr->u.lv_stripes > 1)
stripes_flag = 'S';
sz += sprintf(buf+sz,
"[%c%c%c%c",
inactive_flag,
rw_flag,
allocation_flag,
stripes_flag);
if (lv_ptr->u.lv_stripes > 1)
sz += sprintf(buf+sz, "%-2d",
lv_ptr->u.lv_stripes);
else
sz += sprintf(buf+sz, " ");
/* FIXME: use _basename */
basename = strrchr(lv_ptr->u.lv_name, '/');
if ( basename == 0) basename = lv_ptr->u.lv_name;
else basename++;
sz += sprintf(buf+sz, "] %-25s", basename);
if (strlen(basename) > 25)
sz += sprintf(buf+sz,
"\n ");
sz += sprintf(buf+sz, "%9d /%-6d ",
lv_ptr->u.lv_size >> 1,
lv_ptr->u.lv_size / vg_ptr->pe_size);
if (lv_ptr->u.lv_open == 0)
sz += sprintf(buf+sz, "close");
else
sz += sprintf(buf+sz, "%dx open",
lv_ptr->u.lv_open);
return sz;
}
/*
* provide PV info for proc filesystem use (global)
*/
static int _pv_info(pv_t *pv, char *buf) {
int sz = 0;
char inactive_flag = 'A', allocation_flag = ' ';
char *pv_name = NULL;
if (!(pv->pv_status & PV_ACTIVE))
inactive_flag = 'I';
allocation_flag = 'A';
if (!(pv->pv_allocatable & PV_ALLOCATABLE))
allocation_flag = 'N';
pv_name = strchr(pv->pv_name+1,'/');
if ( pv_name == 0) pv_name = pv->pv_name;
else pv_name++;
sz = sprintf(buf,
"[%c%c] %-21s %8d /%-6d "
"%8d /%-6d %8d /%-6d",
inactive_flag,
allocation_flag,
pv_name,
pv->pe_total * pv->pe_size >> 1,
pv->pe_total,
pv->pe_allocated * pv->pe_size >> 1,
pv->pe_allocated,
(pv->pe_total - pv->pe_allocated) *
pv->pe_size >> 1,
pv->pe_total - pv->pe_allocated);
return sz;
}
static void _show_uuid(const char *src, char *b, char *e) {
int i;
e--;
for(i = 0; *src && (b != e); i++) {
if(i && !(i & 0x3))
*b++ = '-';
*b++ = *src++;
}
*b = '\0';
}
MODULE_LICENSE("GPL");
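The dash grouping done by _show_uuid() can be sketched in userspace as below. This is an illustrative variant, not the removed code: show_uuid() is a hypothetical name, and it checks for room before each write, which is stricter than the b != e test used above.

```c
#include <assert.h>

/*
 * Copy src into [b, e), inserting '-' before every group of four
 * characters after the first. Stops early if the next character (plus
 * its dash, when one is due) would not fit. Always NUL-terminates.
 */
void show_uuid(const char *src, char *b, char *e)
{
    int i;

    assert(b < e);
    e--;  /* reserve one byte for the terminator */
    for (i = 0; *src; i++) {
        /* a dash is due at i = 4, 8, 12, ... */
        int need = (i && !(i & 0x3)) ? 2 : 1;

        if (e - b < need)
            break;
        if (need == 2)
            *b++ = '-';
        *b++ = *src++;
    }
    *b = '\0';
}
```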
/*
* kernel/lvm-internal.h
*
* Copyright (C) 2001 Sistina Software
*
*
* LVM driver is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2, or (at your option)
* any later version.
*
* LVM driver is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with GNU CC; see the file COPYING. If not, write to
* the Free Software Foundation, 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*
*/
/*
* Changelog
*
* 05/01/2001:Joe Thornber - Factored this file out of lvm.c
*
*/
#ifndef LVM_INTERNAL_H
#define LVM_INTERNAL_H
#include <linux/lvm.h>
#define _LVM_INTERNAL_H_VERSION "LVM "LVM_RELEASE_NAME" ("LVM_RELEASE_DATE")"
/* global variables, defined in lvm.c */
extern char *lvm_version;
extern ushort lvm_iop_version;
extern int loadtime;
extern const char *const lvm_name;
extern vg_t *vg[];
extern struct file_operations lvm_chr_fops;
extern struct block_device_operations lvm_blk_dops;
/* debug macros */
#ifdef DEBUG_IOCTL
#define P_IOCTL(fmt, args...) printk(KERN_DEBUG "lvm ioctl: " fmt, ## args)
#else
#define P_IOCTL(fmt, args...)
#endif
#ifdef DEBUG_MAP
#define P_MAP(fmt, args...) printk(KERN_DEBUG "lvm map: " fmt, ## args)
#else
#define P_MAP(fmt, args...)
#endif
#ifdef DEBUG_KFREE
#define P_KFREE(fmt, args...) printk(KERN_DEBUG "lvm kfree: " fmt, ## args)
#else
#define P_KFREE(fmt, args...)
#endif
#ifdef DEBUG_DEVICE
#define P_DEV(fmt, args...) printk(KERN_DEBUG "lvm device: " fmt, ## args)
#else
#define P_DEV(fmt, args...)
#endif
/* lvm-snap.c */
int lvm_get_blksize(kdev_t);
int lvm_snapshot_alloc(lv_t *);
int lvm_snapshot_fill_COW_page(vg_t *, lv_t *);
int lvm_snapshot_COW(kdev_t, ulong, ulong, ulong, vg_t *vg, lv_t *);
int lvm_snapshot_remap_block(kdev_t *, ulong *, ulong, lv_t *);
void lvm_snapshot_release(lv_t *);
int lvm_write_COW_table_block(vg_t *, lv_t *);
void lvm_hash_link(lv_block_exception_t *, kdev_t, ulong, lv_t *);
int lvm_snapshot_alloc_hash_table(lv_t *);
void lvm_drop_snapshot(vg_t *vg, lv_t *, const char *);
/* lvm_fs.c */
void lvm_init_fs(void);
void lvm_fin_fs(void);
void lvm_fs_create_vg(vg_t *vg_ptr);
void lvm_fs_remove_vg(vg_t *vg_ptr);
devfs_handle_t lvm_fs_create_lv(vg_t *vg_ptr, lv_t *lv);
void lvm_fs_remove_lv(vg_t *vg_ptr, lv_t *lv);
void lvm_fs_create_pv(vg_t *vg_ptr, pv_t *pv);
void lvm_fs_remove_pv(vg_t *vg_ptr, pv_t *pv);
#endif
/*
* kernel/lvm-snap.c
*
* Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE
* Heinz Mauelshagen, Sistina Software (persistent snapshots)
*
* LVM snapshot driver is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2, or (at your option)
* any later version.
*
* LVM snapshot driver is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with GNU CC; see the file COPYING. If not, write to
* the Free Software Foundation, 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*
*/
/*
* Changelog
*
* 05/07/2000 - implemented persistent snapshot support
* 23/11/2000 - used cpu_to_le64 rather than my own macro
* 25/01/2001 - Put SetPageLocked back in
* 01/02/2001 - A dropped snapshot is now set as inactive
* 12/03/2001 - lvm_pv_get_number changes:
* o made it static
* o renamed it to _pv_get_number
* o pv number is returned in new uint * arg
* o -1 returned on error
* lvm_snapshot_fill_COW_table has a return value too.
* 25/02/2002 - s/LockPage/SetPageLocked/ - akpm@zip.com.au
*
*/
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/blkdev.h>
#include <linux/smp_lock.h>
#include <linux/types.h>
#include <linux/iobuf.h>
#include <linux/lvm.h>
#include "lvm-internal.h"
static char *lvm_snap_version __attribute__ ((unused)) =
"LVM "LVM_RELEASE_NAME" snapshot code ("LVM_RELEASE_DATE")\n";
extern const char *const lvm_name;
extern int lvm_blocksizes[];
void lvm_snapshot_release(lv_t *);
static int _write_COW_table_block(vg_t *vg, lv_t *lv, int idx,
const char **reason);
static void _disable_snapshot(vg_t *vg, lv_t *lv);
static int _pv_get_number(vg_t * vg, kdev_t rdev, uint *pvn) {
uint p;
for(p = 0; p < vg->pv_max; p++) {
if(vg->pv[p] == NULL)
continue;
if(kdev_same(vg->pv[p]->pv_dev, rdev))
break;
}
if(p >= vg->pv_max) {
/* bad news, the snapshot COW table is probably corrupt */
printk(KERN_ERR
"%s -- _pv_get_number failed for rdev = %u\n",
lvm_name, kdev_t_to_nr(rdev));
return -1;
}
*pvn = vg->pv[p]->pv_number;
return 0;
}
#define hashfn(dev,block,mask,chunk_size) \
((HASHDEV(dev)^((block)/(chunk_size))) & (mask))
static inline lv_block_exception_t *
lvm_find_exception_table(kdev_t org_dev, unsigned long org_start, lv_t * lv)
{
struct list_head * hash_table = lv->lv_snapshot_hash_table, * next;
unsigned long mask = lv->lv_snapshot_hash_mask;
int chunk_size = lv->u.lv_chunk_size;
lv_block_exception_t * ret;
int i = 0;
hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
ret = NULL;
for (next = hash_table->next; next != hash_table; next = next->next)
{
lv_block_exception_t * exception;
exception = list_entry(next, lv_block_exception_t, hash);
if (exception->rsector_org == org_start &&
kdev_same(exception->rdev_org, org_dev))
{
if (i)
{
/* fun, isn't it? :) */
list_del(next);
list_add(next, hash_table);
}
ret = exception;
break;
}
i++;
}
return ret;
}
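/*
 * Illustration only, not part of the driver: a minimal userspace sketch of
 * the move-to-front lookup that lvm_find_exception_table() performs on a
 * hash chain. The names `struct node` and `find_move_to_front` are
 * hypothetical, and a singly linked chain is assumed here in place of the
 * kernel's struct list_head.
 */

```c
#include <stddef.h>

/* Hypothetical chain element; the driver uses lv_block_exception_t. */
struct node {
	unsigned long key;
	struct node *next;
};

/*
 * Search the chain for `key`. A hit that is not already first is unlinked
 * and reinserted at the head, so recently used entries are found sooner.
 * Returns the matching node, or NULL if the key is absent.
 */
struct node *find_move_to_front(struct node **head, unsigned long key)
{
	struct node *prev = NULL, *cur;

	for (cur = *head; cur != NULL; prev = cur, cur = cur->next) {
		if (cur->key != key)
			continue;
		if (prev != NULL) {
			prev->next = cur->next;	/* unlink from current position */
			cur->next = *head;	/* relink in front of old head */
			*head = cur;
		}
		return cur;
	}
	return NULL;
}
```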
inline void lvm_hash_link(lv_block_exception_t * exception,
kdev_t org_dev, unsigned long org_start,
lv_t * lv)
{
struct list_head * hash_table = lv->lv_snapshot_hash_table;
unsigned long mask = lv->lv_snapshot_hash_mask;
int chunk_size = lv->u.lv_chunk_size;
hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
list_add(&exception->hash, hash_table);
}
int lvm_snapshot_remap_block(kdev_t * org_dev, unsigned long * org_sector,
unsigned long pe_start, lv_t * lv)
{
int ret;
unsigned long pe_off, pe_adjustment, __org_start;
kdev_t __org_dev;
int chunk_size = lv->u.lv_chunk_size;
lv_block_exception_t * exception;
pe_off = pe_start % chunk_size;
pe_adjustment = (*org_sector-pe_off) % chunk_size;
__org_start = *org_sector - pe_adjustment;
__org_dev = *org_dev;
ret = 0;
exception = lvm_find_exception_table(__org_dev, __org_start, lv);
if (exception)
{
*org_dev = exception->rdev_new;
*org_sector = exception->rsector_new + pe_adjustment;
ret = 1;
}
return ret;
}
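/*
 * Illustration only, not part of the driver: a sketch of the chunk
 * alignment arithmetic used by lvm_snapshot_remap_block() and
 * lvm_snapshot_COW(). Chunks are counted from the start of the physical
 * extent, so the extent's own misalignment (pe_off) is subtracted before
 * taking the remainder. The function names are hypothetical, and the sketch
 * assumes sector >= pe_start % chunk_size so the unsigned subtraction
 * cannot wrap.
 */

```c
/* Offset of `sector` inside its chunk (pe_adjustment in the driver). */
unsigned long chunk_offset(unsigned long sector, unsigned long pe_start,
			   unsigned long chunk_size)
{
	unsigned long pe_off = pe_start % chunk_size;

	return (sector - pe_off) % chunk_size;
}

/* First sector of the chunk that contains `sector`. */
unsigned long chunk_start(unsigned long sector, unsigned long pe_start,
			  unsigned long chunk_size)
{
	return sector - chunk_offset(sector, pe_start, chunk_size);
}
```

/*
 * For example, with pe_start = 388 and chunk_size = 16, pe_off is 4, so
 * sector 1000 lies 4 sectors into the chunk that starts at sector 996.
 */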
void lvm_drop_snapshot(vg_t *vg, lv_t *lv_snap, const char *reason)
{
kdev_t last_dev;
int i;
/* no exception storage space available for this snapshot
or error on this snapshot --> release it */
invalidate_buffers(lv_snap->u.lv_dev);
/* wipe the snapshot since it's inconsistent now */
_disable_snapshot(vg, lv_snap);
last_dev = NODEV;
for (i = 0; i < lv_snap->u.lv_remap_ptr; i++) {
if ( !kdev_same(lv_snap->u.lv_block_exception[i].rdev_new,
last_dev)) {
last_dev = lv_snap->u.lv_block_exception[i].rdev_new;
invalidate_buffers(last_dev);
}
}
lvm_snapshot_release(lv_snap);
lv_snap->u.lv_status &= ~LV_ACTIVE;
printk(KERN_INFO
"%s -- giving up to snapshot %s on %s: %s\n",
lvm_name, lv_snap->u.lv_snapshot_org->u.lv_name, lv_snap->u.lv_name,
reason);
}
static inline int lvm_snapshot_prepare_blocks(unsigned long *blocks,
unsigned long start,
int nr_sectors,
int blocksize)
{
int i, sectors_per_block, nr_blocks;
sectors_per_block = blocksize / SECTOR_SIZE;
if(start & (sectors_per_block - 1))
return 0;
nr_blocks = nr_sectors / sectors_per_block;
start /= sectors_per_block;
for (i = 0; i < nr_blocks; i++)
blocks[i] = start++;
return 1;
}
#ifdef DEBUG_SNAPSHOT
static inline void invalidate_snap_cache(unsigned long start, unsigned long nr,
kdev_t dev)
{
struct buffer_head * bh;
int sectors_per_block, i, blksize, minor;
minor = minor(dev);
blksize = lvm_blocksizes[minor];
sectors_per_block = blksize >> 9;
nr /= sectors_per_block;
start /= sectors_per_block;
for (i = 0; i < nr; i++)
{
bh = find_get_block(dev, start++, blksize);
if (bh)
bforget(bh);
}
}
#endif
int lvm_snapshot_fill_COW_page(vg_t * vg, lv_t * lv_snap)
{
uint pvn;
int id = 0, is = lv_snap->u.lv_remap_ptr;
ulong blksize_snap;
lv_COW_table_disk_t * lv_COW_table = (lv_COW_table_disk_t *)
page_address(lv_snap->lv_COW_table_iobuf->maplist[0]);
if (is == 0)
return 0;
is--;
blksize_snap =
block_size(lv_snap->u.lv_block_exception[is].rdev_new);
is -= is % (blksize_snap / sizeof(lv_COW_table_disk_t));
memset(lv_COW_table, 0, blksize_snap);
for ( ; is < lv_snap->u.lv_remap_ptr; is++, id++) {
/* store new COW_table entry */
lv_block_exception_t *be = lv_snap->u.lv_block_exception + is;
if(_pv_get_number(vg, be->rdev_org, &pvn))
goto bad;
lv_COW_table[id].pv_org_number = cpu_to_le64(pvn);
lv_COW_table[id].pv_org_rsector = cpu_to_le64(be->rsector_org);
if(_pv_get_number(vg, be->rdev_new, &pvn))
goto bad;
lv_COW_table[id].pv_snap_number = cpu_to_le64(pvn);
lv_COW_table[id].pv_snap_rsector =
cpu_to_le64(be->rsector_new);
}
return 0;
bad:
printk(KERN_ERR "%s -- lvm_snapshot_fill_COW_page failed\n", lvm_name);
return -1;
}
/*
* writes a COW exception table sector to disk (HM)
*/
int lvm_write_COW_table_block(vg_t * vg, lv_t *lv_snap)
{
int r;
const char *err;
if((r = _write_COW_table_block(vg, lv_snap,
lv_snap->u.lv_remap_ptr - 1, &err)))
lvm_drop_snapshot(vg, lv_snap, err);
return r;
}
/*
* copy on write handler for one snapshot logical volume
*
* read the original blocks and store it/them on the new one(s).
* if there is no exception storage space free any longer --> release snapshot.
*
* this routine gets called for each _first_ write to a physical chunk.
*/
int lvm_snapshot_COW(kdev_t org_phys_dev,
unsigned long org_phys_sector,
unsigned long org_pe_start,
unsigned long org_virt_sector,
vg_t *vg, lv_t* lv_snap)
{
const char * reason;
kdev_t snap_phys_dev;
struct block_device *org_bdev, *snap_bdev;
unsigned long org_start, snap_start, virt_start, pe_off;
int idx = lv_snap->u.lv_remap_ptr, chunk_size = lv_snap->u.lv_chunk_size;
struct kiobuf * iobuf;
int blksize_snap, blksize_org, min_blksize, max_blksize;
int max_sectors, nr_sectors;
/* check if we are out of snapshot space */
if (idx >= lv_snap->u.lv_remap_end)
goto fail_out_of_space;
/* calculate physical boundaries of source chunk */
pe_off = org_pe_start % chunk_size;
org_start = org_phys_sector - ((org_phys_sector-pe_off) % chunk_size);
virt_start = org_virt_sector - (org_phys_sector - org_start);
/* calculate physical boundaries of destination chunk */
snap_phys_dev = lv_snap->u.lv_block_exception[idx].rdev_new;
snap_start = lv_snap->u.lv_block_exception[idx].rsector_new;
org_bdev = bdget(kdev_t_to_nr(org_phys_dev));
if (!org_bdev)
goto fail_enomem;
snap_bdev = bdget(kdev_t_to_nr(snap_phys_dev));
if (!snap_bdev) {
bdput(org_bdev);
goto fail_enomem;
}
#ifdef DEBUG_SNAPSHOT
printk(KERN_INFO
"%s -- COW: "
"org %s faulting %lu start %lu, snap %s start %lu, "
"size %d, pe_start %lu pe_off %lu, virt_sec %lu\n",
lvm_name,
kdevname(org_phys_dev), org_phys_sector, org_start,
kdevname(snap_phys_dev), snap_start,
chunk_size,
org_pe_start, pe_off,
org_virt_sector);
#endif
iobuf = lv_snap->lv_iobuf;
blksize_org = block_size(org_phys_dev);
blksize_snap = block_size(snap_phys_dev);
max_blksize = max(blksize_org, blksize_snap);
min_blksize = min(blksize_org, blksize_snap);
max_sectors = LVM_MAX_SECTORS * (min_blksize>>9);
if (chunk_size % (max_blksize>>9))
goto fail_blksize;
while (chunk_size)
{
nr_sectors = min(chunk_size, max_sectors);
chunk_size -= nr_sectors;
iobuf->length = nr_sectors << 9;
if(!lvm_snapshot_prepare_blocks(lv_snap->blocks, org_start,
nr_sectors, blksize_org))
goto fail_prepare;
if (brw_kiovec(READ, 1, &iobuf, org_bdev,
lv_snap->blocks, blksize_org) != (nr_sectors<<9))
goto fail_raw_read;
if(!lvm_snapshot_prepare_blocks(lv_snap->blocks, snap_start,
nr_sectors, blksize_snap))
goto fail_prepare;
if (brw_kiovec(WRITE, 1, &iobuf, snap_bdev,
lv_snap->blocks, blksize_snap) !=(nr_sectors<<9))
goto fail_raw_write;
}
#ifdef DEBUG_SNAPSHOT
/* invalidate the logical snapshot buffer cache */
invalidate_snap_cache(virt_start, lv_snap->u.lv_chunk_size,
lv_snap->u.lv_dev);
#endif
/* the original chunk is now stored on the snapshot volume
so update the exception table */
lv_snap->u.lv_block_exception[idx].rdev_org = org_phys_dev;
lv_snap->u.lv_block_exception[idx].rsector_org = org_start;
lvm_hash_link(lv_snap->u.lv_block_exception + idx,
org_phys_dev, org_start, lv_snap);
lv_snap->u.lv_remap_ptr = idx + 1;
if (lv_snap->lv_snapshot_use_rate > 0) {
if (lv_snap->u.lv_remap_ptr * 100 / lv_snap->u.lv_remap_end >= lv_snap->lv_snapshot_use_rate)
wake_up_interruptible(&lv_snap->lv_snapshot_wait);
}
bdput(snap_bdev);
bdput(org_bdev);
return 0;
/* slow path */
out:
bdput(snap_bdev);
bdput(org_bdev);
out1:
lvm_drop_snapshot(vg, lv_snap, reason);
return 1;
fail_out_of_space:
reason = "out of space";
goto out1;
fail_raw_read:
reason = "read error";
goto out;
fail_raw_write:
reason = "write error";
goto out;
fail_blksize:
reason = "blocksize error";
goto out;
fail_enomem:
reason = "out of memory";
goto out1;
fail_prepare:
reason = "couldn't prepare kiovec blocks "
"(start probably isn't block aligned)";
goto out;
}
int lvm_snapshot_alloc_iobuf_pages(struct kiobuf * iobuf, int sectors)
{
int bytes, nr_pages, err, i;
bytes = sectors * SECTOR_SIZE;
nr_pages = (bytes + ~PAGE_MASK) >> PAGE_SHIFT;
err = expand_kiobuf(iobuf, nr_pages);
if (err) goto out;
err = -ENOMEM;
iobuf->locked = 1;
iobuf->nr_pages = 0;
for (i = 0; i < nr_pages; i++)
{
struct page * page;
page = alloc_page(GFP_KERNEL);
if (!page)
goto out;
iobuf->maplist[i] = page;
SetPageLocked(page);
iobuf->nr_pages++;
}
iobuf->offset = 0;
err = 0;
out:
return err;
}
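/*
 * Illustration only, not part of the driver: a sketch of the page count
 * computed in lvm_snapshot_alloc_iobuf_pages(). Adding ~PAGE_MASK (that is,
 * PAGE_SIZE - 1) before shifting rounds the byte count up to whole pages.
 * The SKETCH_* constants are assumptions (512-byte sectors, 4 KiB pages);
 * the kernel takes the real values from its own headers.
 */

```c
#define SKETCH_SECTOR_SIZE 512UL
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_SIZE   (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK   (~(SKETCH_PAGE_SIZE - 1))

/* Number of pages needed to hold `sectors` sectors, rounded up. */
unsigned long pages_for_sectors(unsigned long sectors)
{
	unsigned long bytes = sectors * SKETCH_SECTOR_SIZE;

	return (bytes + ~SKETCH_PAGE_MASK) >> SKETCH_PAGE_SHIFT;
}
```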
static int calc_max_buckets(void)
{
unsigned long mem;
mem = num_physpages << PAGE_SHIFT;
mem /= 100;
mem *= 2;
mem /= sizeof(struct list_head);
return mem;
}
int lvm_snapshot_alloc_hash_table(lv_t * lv)
{
int err;
unsigned long buckets, max_buckets, size;
struct list_head * hash;
buckets = lv->u.lv_remap_end;
max_buckets = calc_max_buckets();
buckets = min(buckets, max_buckets);
while (buckets & (buckets-1))
buckets &= (buckets-1);
size = buckets * sizeof(struct list_head);
err = -ENOMEM;
hash = vmalloc(size);
lv->lv_snapshot_hash_table = hash;
if (!hash)
goto out;
lv->lv_snapshot_hash_table_size = size;
lv->lv_snapshot_hash_mask = buckets-1;
while (buckets--)
INIT_LIST_HEAD(hash+buckets);
err = 0;
out:
return err;
}
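/*
 * Illustration only, not part of the driver: a sketch of how
 * lvm_snapshot_alloc_hash_table() sizes the table. Clearing the lowest set
 * bit until one bit remains rounds the bucket count down to a power of two,
 * which lets `buckets - 1` serve as the hash mask. The function names are
 * hypothetical.
 */

```c
/* Round n down to the nearest power of two (0 stays 0). */
unsigned long round_down_pow2(unsigned long n)
{
	while (n & (n - 1))
		n &= n - 1;	/* drop the lowest set bit */
	return n;
}

/* Bucket for `key` in a table whose size `buckets` is a power of two. */
unsigned long bucket_index(unsigned long key, unsigned long buckets)
{
	return key & (buckets - 1);
}
```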
int lvm_snapshot_alloc(lv_t * lv_snap)
{
int ret, max_sectors;
/* allocate kiovec to do chunk io */
ret = alloc_kiovec(1, &lv_snap->lv_iobuf);
if (ret) goto out;
max_sectors = LVM_MAX_SECTORS << (PAGE_SHIFT-9);
ret = lvm_snapshot_alloc_iobuf_pages(lv_snap->lv_iobuf, max_sectors);
if (ret) goto out_free_kiovec;
/* allocate kiovec to do exception table io */
ret = alloc_kiovec(1, &lv_snap->lv_COW_table_iobuf);
if (ret) goto out_free_kiovec;
ret = lvm_snapshot_alloc_iobuf_pages(lv_snap->lv_COW_table_iobuf,
PAGE_SIZE/SECTOR_SIZE);
if (ret) goto out_free_both_kiovecs;
ret = lvm_snapshot_alloc_hash_table(lv_snap);
if (ret) goto out_free_both_kiovecs;
out:
return ret;
out_free_both_kiovecs:
unmap_kiobuf(lv_snap->lv_COW_table_iobuf);
free_kiovec(1, &lv_snap->lv_COW_table_iobuf);
lv_snap->lv_COW_table_iobuf = NULL;
out_free_kiovec:
unmap_kiobuf(lv_snap->lv_iobuf);
free_kiovec(1, &lv_snap->lv_iobuf);
lv_snap->lv_iobuf = NULL;
if (lv_snap->lv_snapshot_hash_table != NULL)
vfree(lv_snap->lv_snapshot_hash_table);
lv_snap->lv_snapshot_hash_table = NULL;
goto out;
}
void lvm_snapshot_release(lv_t * lv)
{
if (lv->u.lv_block_exception)
{
vfree(lv->u.lv_block_exception);
lv->u.lv_block_exception = NULL;
}
if (lv->lv_snapshot_hash_table)
{
vfree(lv->lv_snapshot_hash_table);
lv->lv_snapshot_hash_table = NULL;
lv->lv_snapshot_hash_table_size = 0;
}
if (lv->lv_iobuf)
{
kiobuf_wait_for_io(lv->lv_iobuf);
unmap_kiobuf(lv->lv_iobuf);
free_kiovec(1, &lv->lv_iobuf);
lv->lv_iobuf = NULL;
}
if (lv->lv_COW_table_iobuf)
{
kiobuf_wait_for_io(lv->lv_COW_table_iobuf);
unmap_kiobuf(lv->lv_COW_table_iobuf);
free_kiovec(1, &lv->lv_COW_table_iobuf);
lv->lv_COW_table_iobuf = NULL;
}
}
static int _write_COW_table_block(vg_t *vg, lv_t *lv_snap,
int idx, const char **reason) {
int blksize_snap;
int end_of_table;
int idx_COW_table;
uint pvn;
ulong snap_pe_start, COW_table_sector_offset,
COW_entries_per_pe, COW_chunks_per_pe, COW_entries_per_block;
ulong blocks[1];
kdev_t snap_phys_dev;
struct block_device *bdev;
lv_block_exception_t *be;
struct kiobuf * COW_table_iobuf = lv_snap->lv_COW_table_iobuf;
lv_COW_table_disk_t * lv_COW_table =
( lv_COW_table_disk_t *) page_address(lv_snap->lv_COW_table_iobuf->maplist[0]);
COW_chunks_per_pe = LVM_GET_COW_TABLE_CHUNKS_PER_PE(vg, lv_snap);
COW_entries_per_pe = LVM_GET_COW_TABLE_ENTRIES_PER_PE(vg, lv_snap);
/* get physical address of destination chunk */
snap_phys_dev = lv_snap->u.lv_block_exception[idx].rdev_new;
snap_pe_start = lv_snap->u.lv_block_exception[idx - (idx % COW_entries_per_pe)].rsector_new - lv_snap->u.lv_chunk_size;
bdev = bdget(kdev_t_to_nr(snap_phys_dev));
blksize_snap = block_size(snap_phys_dev);
COW_entries_per_block = blksize_snap / sizeof(lv_COW_table_disk_t);
idx_COW_table = idx % COW_entries_per_pe % COW_entries_per_block;
if ( idx_COW_table == 0) memset(lv_COW_table, 0, blksize_snap);
/* sector offset into the on disk COW table */
COW_table_sector_offset = (idx % COW_entries_per_pe) / (SECTOR_SIZE / sizeof(lv_COW_table_disk_t));
/* COW table block to write next */
blocks[0] = (snap_pe_start + COW_table_sector_offset) >> (blksize_snap >> 10);
/* store new COW_table entry */
be = lv_snap->u.lv_block_exception + idx;
if(_pv_get_number(vg, be->rdev_org, &pvn))
goto fail_pv_get_number;
lv_COW_table[idx_COW_table].pv_org_number = cpu_to_le64(pvn);
lv_COW_table[idx_COW_table].pv_org_rsector =
cpu_to_le64(be->rsector_org);
if(_pv_get_number(vg, snap_phys_dev, &pvn))
goto fail_pv_get_number;
lv_COW_table[idx_COW_table].pv_snap_number = cpu_to_le64(pvn);
lv_COW_table[idx_COW_table].pv_snap_rsector =
cpu_to_le64(be->rsector_new);
COW_table_iobuf->length = blksize_snap;
if (brw_kiovec(WRITE, 1, &COW_table_iobuf, bdev,
blocks, blksize_snap) != blksize_snap)
goto fail_raw_write;
/* initialization of next COW exception table block with zeroes */
end_of_table = idx % COW_entries_per_pe == COW_entries_per_pe - 1;
if (idx_COW_table % COW_entries_per_block == COW_entries_per_block - 1 || end_of_table)
{
/* don't go beyond the end */
if (idx + 1 >= lv_snap->u.lv_remap_end) goto out;
memset(lv_COW_table, 0, blksize_snap);
if (end_of_table)
{
idx++;
snap_phys_dev = lv_snap->u.lv_block_exception[idx].rdev_new;
snap_pe_start = lv_snap->u.lv_block_exception[idx - (idx % COW_entries_per_pe)].rsector_new - lv_snap->u.lv_chunk_size;
bdput(bdev);
bdev = bdget(kdev_t_to_nr(snap_phys_dev));
blksize_snap = block_size(snap_phys_dev);
blocks[0] = snap_pe_start >> (blksize_snap >> 10);
} else blocks[0]++;
if (brw_kiovec(WRITE, 1, &COW_table_iobuf, bdev,
blocks, blksize_snap) !=
blksize_snap)
goto fail_raw_write;
}
out:
bdput(bdev);
return 0;
fail_raw_write:
*reason = "write error";
bdput(bdev);
return 1;
fail_pv_get_number:
*reason = "_pv_get_number failed";
bdput(bdev);
return 1;
}
/*
* FIXME_1.2
* This function is a bit of a hack; we need to ensure that the
* snapshot is never made active again, because it will surely be
* corrupt. At the moment we do not have access to the LVM metadata
* from within the kernel. So we set the first exception to point to
* sector 1 (which will always be within the metadata, and as such
* invalid). User land tools will check for this when they are asked
* to activate the snapshot and prevent this from happening.
*/
static void _disable_snapshot(vg_t *vg, lv_t *lv) {
const char *err;
lv->u.lv_block_exception[0].rsector_org = LVM_SNAPSHOT_DROPPED_SECTOR;
if(_write_COW_table_block(vg, lv, 0, &err) != 0) {
printk(KERN_ERR "%s -- couldn't disable snapshot: %s\n",
lvm_name, err);
}
}
MODULE_LICENSE("GPL");
#error Broken until maintainers will sanitize kdev_t handling
/*
* kernel/lvm.c
*
* Copyright (C) 1997 - 2000 Heinz Mauelshagen, Sistina Software
*
* February-November 1997
* April-May,July-August,November 1998
* January-March,May,July,September,October 1999
* January,February,July,September-November 2000
* January 2001
*
*
* LVM driver is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2, or (at your option)
* any later version.
*
* LVM driver is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with GNU CC; see the file COPYING. If not, write to
* the Free Software Foundation, 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*
*/
/*
* Changelog
*
* 09/11/1997 - added chr ioctls VG_STATUS_GET_COUNT
* and VG_STATUS_GET_NAMELIST
* 18/01/1998 - change lvm_chr_open/close lock handling
* 30/04/1998 - changed LV_STATUS ioctl to LV_STATUS_BYNAME and
* - added LV_STATUS_BYINDEX ioctl
* - used lvm_status_byname_req_t and
* lvm_status_byindex_req_t vars
* 04/05/1998 - added multiple device support
* 08/05/1998 - added support to set/clear extendable flag in volume group
* 09/05/1998 - changed output of lvm_proc_get_global_info() because of
* support for free (eg. longer) logical volume names
* 12/05/1998 - added spin_locks (thanks to Pascal van Dam
* <pascal@ramoth.xs4all.nl>)
* 25/05/1998 - fixed handling of locked PEs in lvm_map() and lvm_chr_ioctl()
* 26/05/1998 - reactivated verify_area by access_ok
* 07/06/1998 - used vmalloc/vfree instead of kmalloc/kfree to go
* beyond 128/256 KB max allocation limit per call
* - #ifdef blocked spin_lock calls to avoid compile errors
* with 2.0.x
* 11/06/1998 - another enhancement to spinlock code in lvm_chr_open()
* and use of LVM_VERSION_CODE instead of my own macros
* (thanks to Michael Marxmeier <mike@msede.com>)
* 07/07/1998 - added statistics in lvm_map()
* 08/07/1998 - saved statistics in lvm_do_lv_extend_reduce()
* 25/07/1998 - used __initfunc macro
* 02/08/1998 - changes for official char/block major numbers
* 07/08/1998 - avoided init_module() and cleanup_module() to be static
* 30/08/1998 - changed VG lv_open counter from sum of LV lv_open counters
* to sum of LVs open (no matter how often each is)
* 01/09/1998 - fixed lvm_gendisk.part[] index error
* 07/09/1998 - added copying of lv_current_pe-array
* in LV_STATUS_BYINDEX ioctl
* 17/11/1998 - added KERN_* levels to printk
* 13/01/1999 - fixed LV index bug in lvm_do_lv_create() which hit lvrename
* 07/02/1999 - fixed spinlock handling bug in case of LVM_RESET
* by moving spinlock code from lvm_chr_open()
* to lvm_chr_ioctl()
* - added LVM_LOCK_LVM ioctl to lvm_chr_ioctl()
* - allowed LVM_RESET and retrieval commands to go ahead;
* only other update ioctls are blocked now
* - fixed pv->pe to NULL for pv_status
* - using lv_req structure in lvm_chr_ioctl() now
* - fixed NULL ptr reference bug in lvm_do_lv_extend_reduce()
* caused by uncontiguous PV array in lvm_chr_ioctl(VG_REDUCE)
* 09/02/1999 - changed BLKRASET and BLKRAGET in lvm_chr_ioctl() to
* handle logical volume private read ahead sector
* - implemented LV read_ahead handling with lvm_blk_read()
* and lvm_blk_write()
* 10/02/1999 - implemented 2.[12].* support function lvm_hd_name()
* to be used in drivers/block/genhd.c by disk_name()
* 12/02/1999 - fixed index bug in lvm_blk_ioctl(), HDIO_GETGEO
* - enhanced gendisk insert/remove handling
* 16/02/1999 - changed to dynamic block minor number allocation to
* have as many as 99 volume groups with 256 logical volumes
* as the grand total; this allows having 1 volume group with
* up to 256 logical volumes in it
* 21/02/1999 - added LV open count information to proc filesystem
* - substituted redundant LVM_RESET code by calls
* to lvm_do_vg_remove()
* 22/02/1999 - used schedule_timeout() to be more responsive
* in case of lvm_do_vg_remove() with lots of logical volumes
* 19/03/1999 - fixed NULL pointer bug in module_init/lvm_init
* 17/05/1999 - used DECLARE_WAIT_QUEUE_HEAD macro (>2.3.0)
* - enhanced lvm_hd_name support
* 03/07/1999 - avoided use of KERNEL_VERSION macro based ifdefs and
* memcpy_tofs/memcpy_fromfs macro redefinitions
* 06/07/1999 - corrected reads/writes statistic counter copy in case
* of striped logical volume
* 28/07/1999 - implemented snapshot logical volumes
* - lvm_chr_ioctl
* - LV_STATUS_BYINDEX
* - LV_STATUS_BYNAME
* - lvm_do_lv_create
* - lvm_do_lv_remove
* - lvm_map
* - new lvm_snapshot_remap_block
* - new lvm_snapshot_remap_new_block
* 08/10/1999 - implemented support for multiple snapshots per
* original logical volume
* 12/10/1999 - support for 2.3.19
* 11/11/1999 - support for 2.3.28
* 21/11/1999 - changed lvm_map() interface to buffer_head based
* 19/12/1999 - support for 2.3.33
* 01/01/2000 - changed locking concept in lvm_map(),
* lvm_do_vg_create() and lvm_do_lv_remove()
* 15/01/2000 - fixed PV_FLUSH bug in lvm_chr_ioctl()
* 24/01/2000 - ported to 2.3.40 including Alan Cox's pointer changes etc.
* 29/01/2000 - used kmalloc/kfree again for all small structures
* 20/01/2000 - cleaned up lvm_chr_ioctl by moving code
* to separate functions
* - avoided "/dev/" in proc filesystem output
* - avoided inline strings functions lvm_strlen etc.
* 14/02/2000 - support for 2.3.43
* - integrated Andrea Arcangeli's snapshot code
* 25/06/2000 - james (chip) , IKKHAYD! roffl
* 26/06/2000 - enhanced lv_extend_reduce for snapshot logical volume support
* 06/09/2000 - added devfs support
* 07/09/2000 - changed IOP version to 9
* - started to add new char ioctl LV_STATUS_BYDEV_T to support
* getting an lv_t based on the dev_t of the Logical Volume
* 14/09/2000 - enhanced lvm_do_lv_create to upcall VFS functions
* to sync and lock, activate snapshot and unlock the FS
* (to support journaled filesystems)
* 18/09/2000 - hardsector size support
* 27/09/2000 - implemented lvm_do_lv_rename() and lvm_do_vg_rename()
* 30/10/2000 - added Andi Kleen's LV_BMAP ioctl to support LILO
* 01/11/2000 - added memory information on hash tables to
* lvm_proc_get_global_info()
* 02/11/2000 - implemented /proc/lvm/ hierarchy
* 22/11/2000 - changed lvm_do_create_proc_entry_of_pv () to work
* with devfs
* 26/11/2000 - corrected #ifdef locations for PROC_FS
* 28/11/2000 - fixed lvm_do_vg_extend() NULL pointer BUG
* - fixed lvm_do_create_proc_entry_of_pv() buffer tampering BUG
* 08/01/2001 - Removed conditional compiles related to PROC_FS,
* procfs is always supported now. (JT)
* 12/01/2001 - avoided flushing logical volume in case of shrinking
* because of unnecessary overhead in case of heavy updates
* 25/01/2001 - Allow RO open of an inactive LV so it can be reactivated.
* 31/01/2001 - If you try and BMAP a snapshot you now get an -EPERM
* 01/02/2001 - factored __remap_snapshot out of lvm_map
* 12/02/2001 - move devfs code to create VG before LVs
* 14/02/2001 - tidied device defines for blk.h
* - tidied debug statements
* - more lvm_map tidying
* 14/02/2001 - bug: vg[] member not set back to NULL if activation fails
* 28/02/2001 - introduced the P_DEV macro and changed some internal
* functions to be static [AD]
* 28/02/2001 - factored lvm_get_snapshot_use_rate out of blk_ioctl [AD]
* - fixed user address accessing bug in lvm_do_lv_create()
* where the check for an existing LV takes place right at
* the beginning
* 01/03/2001 - Add VG_CREATE_OLD for IOP 10 compatibility
* 02/03/2001 - Don't destroy usermode pointers in lv_t structures during
* LV_STATUS_BYxxx and remove redundant lv_t variables from same.
* 05/03/2001 - restore copying pe_t array in lvm_do_lv_status_byname. For
* lvdisplay -v (PC)
* - restore copying pe_t array in lvm_do_lv_status_byindex (HM)
* - added copying pe_t array in lvm_do_lv_status_bydev (HM)
* - enhanced lvm_do_lv_status_by{name,index,dev} to be capable
* to copy the lv_block_exception_t array to userspace (HM)
* 08/03/2001 - factored lvm_do_pv_flush out of lvm_chr_ioctl [HM]
* 09/03/2001 - Added _lock_open_count to ensure we only drop the lock
* when the locking process closes.
* 05/04/2001 - lvm_map bugs: don't use b_blocknr/b_dev in lvm_map, it
* destroys stacking devices. call b_end_io on failed maps.
* (Jens Axboe)
* - Defer writes to an extent that is being moved [JT + AD]
* 28/05/2001 - implemented missing BLKSSZGET ioctl [AD]
* 28/12/2001 - buffer_head -> bio
* removed huge allocation of a lv_t on stack
* (Anders Gustafsson)
* 07/01/2002 - fixed sizeof(lv_t) differences in user/kernel-space
* removed another huge allocation of a lv_t on stack
* (Anders Gustafsson)
*
*/
#define MAJOR_NR LVM_BLK_MAJOR
#define DEVICE_OFF(device)
#define LOCAL_END_REQUEST
/* lvm_do_lv_create calls fsync_dev_lockfs()/unlockfs() */
/* #define LVM_VFS_ENHANCEMENT */
#include <linux/config.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/vmalloc.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/hdreg.h>
#include <linux/stat.h>
#include <linux/fs.h>
#include <linux/bio.h>
#include <linux/proc_fs.h>
#include <linux/blkdev.h>
#include <linux/genhd.h>
#include <linux/smp_lock.h>
#include <asm/ioctl.h>
#include <asm/uaccess.h>
#ifdef CONFIG_KERNELD
#include <linux/kerneld.h>
#endif
#include <linux/blk.h>
#include <linux/blkpg.h>
#include <linux/errno.h>
#include <linux/lvm.h>
#include "lvm-internal.h"
#ifndef WRITEA
# define WRITEA WRITE
#endif
/*
* External function prototypes
*/
static int lvm_make_request_fn(request_queue_t*, struct bio *);
static int lvm_blk_ioctl(struct inode *, struct file *, uint, ulong);
static int lvm_blk_open(struct inode *, struct file *);
static int lvm_blk_close(struct inode *, struct file *);
static int lvm_get_snapshot_use_rate(lv_t *lv_ptr, void *arg);
static int lvm_user_bmap(struct inode *, struct lv_bmap *);
static int lvm_chr_open(struct inode *, struct file *);
static int lvm_chr_close(struct inode *, struct file *);
static int lvm_chr_ioctl(struct inode *, struct file *, uint, ulong);
/* End external function prototypes */
/*
* Internal function prototypes
*/
static void lvm_cleanup(void);
static void lvm_init_vars(void);
#ifdef LVM_HD_NAME
extern void (*lvm_hd_name_ptr) (char *, int);
#endif
static int lvm_map(struct bio *);
static int lvm_do_lock_lvm(void);
static int lvm_do_le_remap(vg_t *, void *);
static int lvm_do_pv_create(pv_t *, vg_t *, ulong);
static int lvm_do_pv_remove(vg_t *, ulong);
static int lvm_do_lv_create(int, char *, userlv_t *);
static int lvm_do_lv_extend_reduce(int, char *, userlv_t *);
static int lvm_do_lv_remove(int, char *, int);
static int lvm_do_lv_rename(vg_t *, lv_req_t *, userlv_t *);
static int lvm_do_lv_status_byname(vg_t *r, void *);
static int lvm_do_lv_status_byindex(vg_t *, void *);
static int lvm_do_lv_status_bydev(vg_t *, void *);
static int lvm_do_pe_lock_unlock(vg_t *r, void *);
static int lvm_do_pv_change(vg_t*, void*);
static int lvm_do_pv_status(vg_t *, void *);
static int lvm_do_pv_flush(void *);
static int lvm_do_vg_create(void *, int minor);
static int lvm_do_vg_extend(vg_t *, void *);
static int lvm_do_vg_reduce(vg_t *, void *);
static int lvm_do_vg_rename(vg_t *, void *);
static int lvm_do_vg_remove(int);
static void lvm_geninit(struct gendisk *);
static void __update_hardsectsize(lv_t *lv);
static void _queue_io(struct bio *bh, int rw);
static struct bio *_dequeue_io(void);
static void _flush_io(struct bio *bh);
static int _open_pv(pv_t *pv);
static void _close_pv(pv_t *pv);
static unsigned long _sectors_to_k(unsigned long sect);
#ifdef LVM_HD_NAME
void lvm_hd_name(char *, int);
#endif
/* END Internal function prototypes */
/* variables */
char *lvm_version = "LVM version "LVM_RELEASE_NAME"("LVM_RELEASE_DATE")";
ushort lvm_iop_version = LVM_DRIVER_IOP_VERSION;
int loadtime = 0;
const char *const lvm_name = LVM_NAME;
/* volume group descriptor area pointers */
vg_t *vg[ABS_MAX_VG];
/* map from block minor number to VG and LV numbers */
typedef struct {
int vg_number;
int lv_number;
} vg_lv_map_t;
static vg_lv_map_t vg_lv_map[ABS_MAX_LV];
/* Request structures (lvm_chr_ioctl()) */
static pv_change_req_t pv_change_req;
static pv_status_req_t pv_status_req;
volatile static pe_lock_req_t pe_lock_req;
static le_remap_req_t le_remap_req;
static lv_req_t lv_req;
#ifdef LVM_TOTAL_RESET
static int lvm_reset_spindown = 0;
#endif
static char pv_name[NAME_LEN];
/* static char rootvg[NAME_LEN] = { 0, }; */
static int lock = 0;
static int _lock_open_count = 0;
static uint vg_count = 0;
static long lvm_chr_open_count = 0;
static DECLARE_WAIT_QUEUE_HEAD(lvm_wait);
static spinlock_t lvm_lock = SPIN_LOCK_UNLOCKED;
static spinlock_t lvm_snapshot_lock = SPIN_LOCK_UNLOCKED;
static struct bio *_pe_requests;
static DECLARE_RWSEM(_pe_lock);
struct file_operations lvm_chr_fops = {
open: lvm_chr_open,
release: lvm_chr_close,
ioctl: lvm_chr_ioctl,
};
/* block device operations structure needed for 2.3.38? and above */
struct block_device_operations lvm_blk_dops =
{
owner: THIS_MODULE,
open: lvm_blk_open,
release: lvm_blk_close,
ioctl: lvm_blk_ioctl,
};
/* gendisk structures */
static struct hd_struct lvm_hd_struct[MAX_LV];
static int lvm_blocksizes[MAX_LV];
static int lvm_size[MAX_LV];
static struct gendisk lvm_gendisk =
{
major: MAJOR_NR,
major_name: LVM_NAME,
minor_shift: 0,
part: lvm_hd_struct,
sizes: lvm_size,
nr_real: MAX_LV,
};
/*
* Driver initialization...
*/
int lvm_init(void)
{
if (register_chrdev(LVM_CHAR_MAJOR,
lvm_name, &lvm_chr_fops) < 0) {
printk(KERN_ERR "%s -- register_chrdev failed\n",
lvm_name);
return -EIO;
}
if (register_blkdev(MAJOR_NR, lvm_name, &lvm_blk_dops) < 0) {
printk(KERN_ERR "%s -- register_blkdev failed\n", lvm_name);
if (unregister_chrdev(LVM_CHAR_MAJOR, lvm_name) < 0)
printk(KERN_ERR
"%s -- unregister_chrdev failed\n",
lvm_name);
return -EIO;
}
lvm_init_fs();
lvm_init_vars();
lvm_geninit(&lvm_gendisk);
add_gendisk(&lvm_gendisk);
#ifdef LVM_HD_NAME
/* reference from drivers/block/genhd.c */
lvm_hd_name_ptr = lvm_hd_name;
#endif
blk_queue_make_request(BLK_DEFAULT_QUEUE(MAJOR_NR), lvm_make_request_fn);
/* initialise the pe lock */
pe_lock_req.lock = UNLOCK_PE;
/* optional read root VGDA */
/*
if ( *rootvg != 0) vg_read_with_pv_and_lv ( rootvg, &vg);
*/
#ifdef MODULE
printk(KERN_INFO "%s module loaded\n", lvm_version);
#else
printk(KERN_INFO "%s\n", lvm_version);
#endif
return 0;
} /* lvm_init() */
/*
* cleanup...
*/
static void lvm_cleanup(void)
{
if (unregister_chrdev(LVM_CHAR_MAJOR, lvm_name) < 0)
printk(KERN_ERR "%s -- unregister_chrdev failed\n",
lvm_name);
if (unregister_blkdev(MAJOR_NR, lvm_name) < 0)
printk(KERN_ERR "%s -- unregister_blkdev failed\n",
lvm_name);
del_gendisk(&lvm_gendisk);
blk_clear(MAJOR_NR);
#ifdef LVM_HD_NAME
/* reference from linux/drivers/block/genhd.c */
lvm_hd_name_ptr = NULL;
#endif
/* unregister with procfs and devfs */
lvm_fin_fs();
#ifdef MODULE
printk(KERN_INFO "%s -- Module successfully deactivated\n", lvm_name);
#endif
return;
} /* lvm_cleanup() */
/*
* support function to initialize lvm variables
*/
static void __init lvm_init_vars(void)
{
int v;
loadtime = CURRENT_TIME;
lvm_lock = lvm_snapshot_lock = SPIN_LOCK_UNLOCKED;
pe_lock_req.lock = UNLOCK_PE;
pe_lock_req.data.lv_dev = NODEV;
pe_lock_req.data.pv_dev = NODEV;
pe_lock_req.data.pv_offset = 0;
/* Initialize VG pointers */
for (v = 0; v < ABS_MAX_VG; v++) vg[v] = NULL;
/* Initialize LV -> VG association */
for (v = 0; v < ABS_MAX_LV; v++) {
/* index ABS_MAX_VG never used for real VG */
vg_lv_map[v].vg_number = ABS_MAX_VG;
vg_lv_map[v].lv_number = -1;
}
return;
} /* lvm_init_vars() */
/********************************************************************
*
* Character device functions
*
********************************************************************/
#define MODE_TO_STR(mode) (mode) & FMODE_READ ? "READ" : "", \
(mode) & FMODE_WRITE ? "WRITE" : ""
/*
* character device open routine
*/
static int lvm_chr_open(struct inode *inode, struct file *file)
{
unsigned int minor = minor(inode->i_rdev);
P_DEV("chr_open MINOR: %d VG#: %d mode: %s%s lock: %d\n",
minor, VG_CHR(minor), MODE_TO_STR(file->f_mode), lock);
/* super user validation */
if (!capable(CAP_SYS_ADMIN)) return -EACCES;
/* Group special file open */
if (VG_CHR(minor) > MAX_VG) return -ENXIO;
spin_lock(&lvm_lock);
if(lock == current->pid)
_lock_open_count++;
spin_unlock(&lvm_lock);
lvm_chr_open_count++;
MOD_INC_USE_COUNT;
return 0;
} /* lvm_chr_open() */
/*
* character device i/o-control routine
*
* Only one changing process can do changing ioctl at one time,
* others will block.
*
*/
static int lvm_chr_ioctl(struct inode *inode, struct file *file,
uint command, ulong a)
{
int minor = minor(inode->i_rdev);
uint extendable, l, v;
void *arg = (void *) a;
userlv_t ulv;
vg_t* vg_ptr = vg[VG_CHR(minor)];
/* otherwise cc will complain about unused variables */
(void) lvm_lock;
P_IOCTL("chr MINOR: %d command: 0x%X arg: %p VG#: %d mode: %s%s\n",
minor, command, arg, VG_CHR(minor), MODE_TO_STR(file->f_mode));
#ifdef LVM_TOTAL_RESET
if (lvm_reset_spindown > 0) return -EACCES;
#endif
/* Main command switch */
switch (command) {
case LVM_LOCK_LVM:
/* lock the LVM */
return lvm_do_lock_lvm();
case LVM_GET_IOP_VERSION:
/* check lvm version to ensure driver/tools+lib
interoperability */
if (copy_to_user(arg, &lvm_iop_version, sizeof(ushort)) != 0)
return -EFAULT;
return 0;
#ifdef LVM_TOTAL_RESET
case LVM_RESET:
/* lock reset function */
lvm_reset_spindown = 1;
for (v = 0; v < ABS_MAX_VG; v++) {
if (vg[v] != NULL) lvm_do_vg_remove(v);
}
#ifdef MODULE
while (GET_USE_COUNT(&__this_module) < 1)
MOD_INC_USE_COUNT;
while (GET_USE_COUNT(&__this_module) > 1)
MOD_DEC_USE_COUNT;
#endif /* MODULE */
lock = 0; /* release lock */
wake_up_interruptible(&lvm_wait);
return 0;
#endif /* LVM_TOTAL_RESET */
case LE_REMAP:
/* remap a logical extent (after moving the physical extent) */
return lvm_do_le_remap(vg_ptr,arg);
case PE_LOCK_UNLOCK:
/* lock/unlock i/o to a physical extent to move it to another
physical volume (move's done in user space's pvmove) */
return lvm_do_pe_lock_unlock(vg_ptr,arg);
case VG_CREATE_OLD:
/* create a VGDA */
return lvm_do_vg_create(arg, minor);
case VG_CREATE:
/* create a VGDA, assume VG number is filled in */
return lvm_do_vg_create(arg, -1);
case VG_EXTEND:
/* extend a volume group */
return lvm_do_vg_extend(vg_ptr, arg);
case VG_REDUCE:
/* reduce a volume group */
return lvm_do_vg_reduce(vg_ptr, arg);
case VG_RENAME:
/* rename a volume group */
return lvm_do_vg_rename(vg_ptr, arg);
case VG_REMOVE:
/* remove an inactive VGDA */
return lvm_do_vg_remove(minor);
case VG_SET_EXTENDABLE:
/* set/clear extendability flag of volume group */
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&extendable, arg, sizeof(extendable)) != 0)
return -EFAULT;
if (extendable == VG_EXTENDABLE ||
extendable == ~VG_EXTENDABLE) {
if (extendable == VG_EXTENDABLE)
vg_ptr->vg_status |= VG_EXTENDABLE;
else
vg_ptr->vg_status &= ~VG_EXTENDABLE;
} else return -EINVAL;
return 0;
case VG_STATUS:
/* get volume group data (only the vg_t struct) */
if (vg_ptr == NULL) return -ENXIO;
if (copy_to_user(arg, vg_ptr, sizeof(vg_t)) != 0)
return -EFAULT;
return 0;
case VG_STATUS_GET_COUNT:
/* get volume group count */
if (copy_to_user(arg, &vg_count, sizeof(vg_count)) != 0)
return -EFAULT;
return 0;
case VG_STATUS_GET_NAMELIST:
/* get volume group names */
for (l = v = 0; v < ABS_MAX_VG; v++) {
if (vg[v] != NULL) {
if (copy_to_user(arg + l * NAME_LEN,
vg[v]->vg_name,
NAME_LEN) != 0)
return -EFAULT;
l++;
}
}
return 0;
case LV_CREATE:
case LV_EXTEND:
case LV_REDUCE:
case LV_REMOVE:
case LV_RENAME:
/* create, extend, reduce, remove or rename a logical volume */
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&lv_req, arg, sizeof(lv_req)) != 0)
return -EFAULT;
if (command != LV_REMOVE) {
if (copy_from_user(&ulv, lv_req.lv, sizeof(userlv_t)) != 0)
return -EFAULT;
}
switch (command) {
case LV_CREATE:
return lvm_do_lv_create(minor, lv_req.lv_name, &ulv);
case LV_EXTEND:
case LV_REDUCE:
return lvm_do_lv_extend_reduce(minor, lv_req.lv_name, &ulv);
case LV_REMOVE:
return lvm_do_lv_remove(minor, lv_req.lv_name, -1);
case LV_RENAME:
return lvm_do_lv_rename(vg_ptr, &lv_req, &ulv);
}
case LV_STATUS_BYNAME:
/* get status of a logical volume by name */
return lvm_do_lv_status_byname(vg_ptr, arg);
case LV_STATUS_BYINDEX:
/* get status of a logical volume by index */
return lvm_do_lv_status_byindex(vg_ptr, arg);
case LV_STATUS_BYDEV:
/* get status of a logical volume by device */
return lvm_do_lv_status_bydev(vg_ptr, arg);
case PV_CHANGE:
/* change a physical volume */
return lvm_do_pv_change(vg_ptr,arg);
case PV_STATUS:
/* get physical volume data (pv_t structure only) */
return lvm_do_pv_status(vg_ptr,arg);
case PV_FLUSH:
/* physical volume buffer flush/invalidate */
return lvm_do_pv_flush(arg);
default:
printk(KERN_WARNING
"%s -- lvm_chr_ioctl: unknown command 0x%x\n",
lvm_name, command);
return -EINVAL;
}
return 0;
} /* lvm_chr_ioctl */
/*
* character device close routine
*/
static int lvm_chr_close(struct inode *inode, struct file *file)
{
P_DEV("chr_close MINOR: %d VG#: %d\n",
minor(inode->i_rdev), VG_CHR(minor(inode->i_rdev)));
#ifdef LVM_TOTAL_RESET
if (lvm_reset_spindown > 0) {
lvm_reset_spindown = 0;
lvm_chr_open_count = 0;
}
#endif
if (lvm_chr_open_count > 0) lvm_chr_open_count--;
spin_lock(&lvm_lock);
if(lock == current->pid) {
if(!_lock_open_count) {
P_DEV("chr_close: unlocking LVM for pid %d\n", lock);
lock = 0;
wake_up_interruptible(&lvm_wait);
} else
_lock_open_count--;
}
spin_unlock(&lvm_lock);
MOD_DEC_USE_COUNT;
return 0;
} /* lvm_chr_close() */
/********************************************************************
*
* Block device functions
*
********************************************************************/
/*
* block device open routine
*/
static int lvm_blk_open(struct inode *inode, struct file *file)
{
int minor = minor(inode->i_rdev);
lv_t *lv_ptr;
vg_t *vg_ptr = vg[VG_BLK(minor)];
P_DEV("blk_open MINOR: %d VG#: %d LV#: %d mode: %s%s\n",
minor, VG_BLK(minor), LV_BLK(minor), MODE_TO_STR(file->f_mode));
#ifdef LVM_TOTAL_RESET
if (lvm_reset_spindown > 0)
return -EPERM;
#endif
if (vg_ptr != NULL &&
(vg_ptr->vg_status & VG_ACTIVE) &&
(lv_ptr = vg_ptr->lv[LV_BLK(minor)]) != NULL &&
LV_BLK(minor) >= 0 &&
LV_BLK(minor) < vg_ptr->lv_max) {
/* Check parallel LV spindown (LV remove) */
if (lv_ptr->u.lv_status & LV_SPINDOWN) return -EPERM;
/* Check inactive LV and open for read/write */
/* We need to be able to "read" an inactive LV
to re-activate it again */
if ((file->f_mode & FMODE_WRITE) &&
(!(lv_ptr->u.lv_status & LV_ACTIVE)))
return -EPERM;
if (!(lv_ptr->u.lv_access & LV_WRITE) &&
(file->f_mode & FMODE_WRITE))
return -EACCES;
/* be sure to increment VG counter */
if (lv_ptr->u.lv_open == 0) vg_ptr->lv_open++;
lv_ptr->u.lv_open++;
MOD_INC_USE_COUNT;
P_DEV("blk_open OK, LV size %d\n", lv_ptr->u.lv_size);
return 0;
}
return -ENXIO;
} /* lvm_blk_open() */
/*
* block device i/o-control routine
*/
static int lvm_blk_ioctl(struct inode *inode, struct file *file,
uint command, ulong a)
{
int minor = minor(inode->i_rdev);
vg_t *vg_ptr = vg[VG_BLK(minor)];
lv_t *lv_ptr = vg_ptr->lv[LV_BLK(minor)];
void *arg = (void *) a;
struct hd_geometry *hd = (struct hd_geometry *) a;
P_IOCTL("blk MINOR: %d command: 0x%X arg: %p VG#: %d LV#: %d "
"mode: %s%s\n", minor, command, arg, VG_BLK(minor),
LV_BLK(minor), MODE_TO_STR(file->f_mode));
switch (command) {
case BLKSSZGET:
/* get block device sector size as needed e.g. by fdisk */
return put_user(bdev_hardsect_size(inode->i_bdev), (int *) arg);
case BLKGETSIZE:
/* return device size */
P_IOCTL("BLKGETSIZE: %u\n", lv_ptr->u.lv_size);
if (put_user(lv_ptr->u.lv_size, (unsigned long *)arg))
return -EFAULT;
break;
case BLKGETSIZE64:
if (put_user((u64)lv_ptr->u.lv_size << 9, (u64 *)arg))
return -EFAULT;
break;
case HDIO_GETGEO:
/* get disk geometry */
P_IOCTL("%s -- lvm_blk_ioctl -- HDIO_GETGEO\n", lvm_name);
if (hd == NULL)
return -EINVAL;
{
unsigned char heads = 64;
unsigned char sectors = 32;
long start = 0;
short cylinders = lv_ptr->u.lv_size / heads / sectors;
if (copy_to_user((char *) &hd->heads, &heads,
sizeof(heads)) != 0 ||
copy_to_user((char *) &hd->sectors, &sectors,
sizeof(sectors)) != 0 ||
copy_to_user((short *) &hd->cylinders,
&cylinders, sizeof(cylinders)) != 0 ||
copy_to_user((long *) &hd->start, &start,
sizeof(start)) != 0)
return -EFAULT;
P_IOCTL("%s -- lvm_blk_ioctl -- cylinders: %d\n",
lvm_name, cylinders);
}
break;
case LV_SET_ACCESS:
/* set access flags of a logical volume */
if (!capable(CAP_SYS_ADMIN)) return -EACCES;
lv_ptr->u.lv_access = (ulong) arg;
if ( lv_ptr->u.lv_access & LV_WRITE)
set_device_ro(lv_ptr->u.lv_dev, 0);
else
set_device_ro(lv_ptr->u.lv_dev, 1);
break;
case LV_SET_STATUS:
/* set status flags of a logical volume */
if (!capable(CAP_SYS_ADMIN)) return -EACCES;
if (!((ulong) arg & LV_ACTIVE) && lv_ptr->u.lv_open > 1)
return -EPERM;
lv_ptr->u.lv_status = (ulong) arg;
break;
case LV_BMAP:
/* turn logical block into (dev_t, block). non privileged. */
/* don't bmap a snapshot, since the mapping can change */
if(lv_ptr->u.lv_access & LV_SNAPSHOT)
return -EPERM;
return lvm_user_bmap(inode, (struct lv_bmap *) arg);
case LV_SET_ALLOCATION:
/* set allocation flags of a logical volume */
if (!capable(CAP_SYS_ADMIN)) return -EACCES;
lv_ptr->u.lv_allocation = (ulong) arg;
break;
case LV_SNAPSHOT_USE_RATE:
return lvm_get_snapshot_use_rate(lv_ptr, arg);
default:
printk(KERN_WARNING
"%s -- lvm_blk_ioctl: unknown command 0x%x\n",
lvm_name, command);
return -EINVAL;
}
return 0;
} /* lvm_blk_ioctl() */
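/*
 * Illustrative sketch, not part of the driver: the HDIO_GETGEO branch
 * above reports a fixed geometry of 64 heads and 32 sectors and derives
 * the cylinder count by integer division, storing it in a short.  The
 * helper names below (geometry_cylinders, fits_in_short) are hypothetical
 * and only show the arithmetic and where a short would stop being able
 * to hold the result.
 */

```c
#include <assert.h>
#include <limits.h>

/* Cylinder count for a device of size_sectors sectors, rounded down. */
static unsigned long geometry_cylinders(unsigned long size_sectors,
                                        unsigned long heads,
                                        unsigned long sectors)
{
    assert(heads > 0 && sectors > 0);
    return size_sectors / heads / sectors;
}

/* Non-zero when the cylinder count can be stored in a short unchanged. */
static int fits_in_short(unsigned long cylinders)
{
    return cylinders <= (unsigned long)SHRT_MAX;
}
```

/*
 * With 64 heads and 32 sectors, one cylinder covers 2048 sectors, so a
 * 2048-sector volume reports exactly one cylinder.
 */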
/*
* block device close routine
*/
static int lvm_blk_close(struct inode *inode, struct file *file)
{
int minor = minor(inode->i_rdev);
vg_t *vg_ptr = vg[VG_BLK(minor)];
lv_t *lv_ptr = vg_ptr->lv[LV_BLK(minor)];
P_DEV("blk_close MINOR: %d VG#: %d LV#: %d\n",
minor, VG_BLK(minor), LV_BLK(minor));
if (lv_ptr->u.lv_open == 1) vg_ptr->lv_open--;
lv_ptr->u.lv_open--;
MOD_DEC_USE_COUNT;
return 0;
} /* lvm_blk_close() */
static int lvm_get_snapshot_use_rate(lv_t *lv, void *arg)
{
lv_snapshot_use_rate_req_t lv_rate_req;
if (!(lv->u.lv_access & LV_SNAPSHOT))
return -EPERM;
if (copy_from_user(&lv_rate_req, arg, sizeof(lv_rate_req)))
return -EFAULT;
if (lv_rate_req.rate < 0 || lv_rate_req.rate > 100)
return -EINVAL;
switch (lv_rate_req.block) {
case 0:
lv->lv_snapshot_use_rate = lv_rate_req.rate;
if (lv->u.lv_remap_ptr * 100 / lv->u.lv_remap_end <
lv->lv_snapshot_use_rate)
interruptible_sleep_on(&lv->lv_snapshot_wait);
break;
case O_NONBLOCK:
break;
default:
return -EINVAL;
}
lv_rate_req.rate = lv->u.lv_remap_ptr * 100 / lv->u.lv_remap_end;
return copy_to_user(arg, &lv_rate_req,
sizeof(lv_rate_req)) ? -EFAULT : 0;
}
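/*
 * Illustrative sketch, not part of the driver: lvm_get_snapshot_use_rate()
 * reports how full a snapshot's exception table is as
 * lv_remap_ptr * 100 / lv_remap_end.  The helper below is hypothetical and
 * shows that integer percentage on its own.  The guard against a zero-sized
 * table is an addition made for this sketch; the function above has no
 * such check.
 */

```c
#include <assert.h>

/* Percentage of the exception table in use, or -1 for an empty table. */
static int snapshot_use_percent(unsigned long remap_ptr,
                                unsigned long remap_end)
{
    if (remap_end == 0)
        return -1;              /* assumption: treat "no table" as an error */
    assert(remap_ptr <= remap_end);
    return (int)(remap_ptr * 100 / remap_end);
}
```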
static int lvm_user_bmap(struct inode *inode, struct lv_bmap *user_result)
{
struct bio bio;
unsigned long block;
int err;
if (get_user(block, &user_result->lv_block))
return -EFAULT;
memset(&bio,0,sizeof(bio));
bio.bi_dev = inode->i_rdev;
bio.bi_size = block_size(bio.bi_dev); /* NEEDED by bio_sectors */
bio.bi_sector = block * bio_sectors(&bio);
bio.bi_rw = READ;
if ((err=lvm_map(&bio)) < 0) {
printk("lvm map failed: %d\n", err);
return -EINVAL;
}
return put_user(kdev_t_to_nr(bio.bi_dev), &user_result->lv_dev) ||
put_user(bio.bi_sector/bio_sectors(&bio), &user_result->lv_block) ?
-EFAULT : 0;
}
/*
* block device support function for /usr/src/linux/drivers/block/ll_rw_blk.c
* (see init_module/lvm_init)
*/
static void __remap_snapshot(kdev_t rdev, ulong rsector,
ulong pe_start, lv_t *lv, vg_t *vg) {
/* copy a chunk from the origin to a snapshot device */
down_write(&lv->lv_lock);
/* we must redo lvm_snapshot_remap_block in order to avoid a
race condition in the gap where no lock was held */
if (!lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv) &&
!lvm_snapshot_COW(rdev, rsector, pe_start, rsector, vg, lv))
lvm_write_COW_table_block(vg, lv);
up_write(&lv->lv_lock);
}
static inline void _remap_snapshot(kdev_t rdev, ulong rsector,
ulong pe_start, lv_t *lv, vg_t *vg) {
int r;
/* check to see if this chunk is already in the snapshot */
down_read(&lv->lv_lock);
r = lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv);
up_read(&lv->lv_lock);
if (!r)
/* we haven't yet copied this block to the snapshot */
__remap_snapshot(rdev, rsector, pe_start, lv, vg);
}
/*
* extents destined for a pe that is on the move should be deferred
*/
static inline int _should_defer(kdev_t pv, ulong sector, uint32_t pe_size) {
return ((pe_lock_req.lock == LOCK_PE) &&
kdev_same(pv, pe_lock_req.data.pv_dev) &&
(sector >= pe_lock_req.data.pv_offset) &&
(sector < (pe_lock_req.data.pv_offset + pe_size)));
}
static inline int _defer_extent(struct bio *bh, int rw,
kdev_t pv, ulong sector, uint32_t pe_size)
{
if (pe_lock_req.lock == LOCK_PE) {
down_read(&_pe_lock);
if (_should_defer(pv, sector, pe_size)) {
up_read(&_pe_lock);
down_write(&_pe_lock);
if (_should_defer(pv, sector, pe_size))
_queue_io(bh, rw);
up_write(&_pe_lock);
return 1;
}
up_read(&_pe_lock);
}
return 0;
}
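/*
 * Illustrative sketch, not part of the driver: _should_defer() is a
 * half-open range test.  A request is deferred only while a PE lock is
 * held, the request targets the locked physical volume, and its sector
 * lies in [pv_offset, pv_offset + pe_size).  The struct and function names
 * below are hypothetical, and a plain int stands in for kdev_t.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields of pe_lock_req used by the test. */
struct pe_lock_sketch {
    bool locked;              /* corresponds to lock == LOCK_PE */
    int pv_dev;               /* device holding the extent being moved */
    unsigned long pv_offset;  /* first sector of that extent */
};

static bool should_defer(const struct pe_lock_sketch *req, int pv,
                         unsigned long sector, unsigned long pe_size)
{
    assert(req != NULL);
    return req->locked &&
           pv == req->pv_dev &&
           sector >= req->pv_offset &&
           sector < req->pv_offset + pe_size;
}
```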
static int lvm_map(struct bio *bi)
{
int minor = minor(bi->bi_dev);
ulong index;
ulong pe_start;
ulong size = bio_sectors(bi);
ulong rsector_org = bi->bi_sector;
ulong rsector_map;
kdev_t rdev_map;
vg_t *vg_this = vg[VG_BLK(minor)];
lv_t *lv = vg_this->lv[LV_BLK(minor)];
int rw = bio_rw(bi);
down_read(&lv->lv_lock);
if (!(lv->u.lv_status & LV_ACTIVE)) {
printk(KERN_ALERT
"%s - lvm_map: ll_rw_blk for inactive LV %s\n",
lvm_name, lv->u.lv_name);
goto bad;
}
if ((rw == WRITE || rw == WRITEA) &&
!(lv->u.lv_access & LV_WRITE)) {
printk(KERN_CRIT
"%s - lvm_map: ll_rw_blk write for readonly LV %s\n",
lvm_name, lv->u.lv_name);
goto bad;
}
P_MAP("%s - lvm_map minor: %d *rdev: %s *rsector: %lu size:%lu\n",
lvm_name, minor,
kdevname(bi->bi_dev),
rsector_org, size);
if (rsector_org + size > lv->u.lv_size) {
printk(KERN_ALERT
"%s - lvm_map access beyond end of device; *rsector: "
"%lu or size: %lu wrong for minor: %2d\n",
lvm_name, rsector_org, size, minor);
goto bad;
}
if (lv->u.lv_stripes < 2) { /* linear mapping */
/* get the index */
index = rsector_org / vg_this->pe_size;
pe_start = lv->u.lv_current_pe[index].pe;
rsector_map = lv->u.lv_current_pe[index].pe +
(rsector_org % vg_this->pe_size);
rdev_map = lv->u.lv_current_pe[index].dev;
P_MAP("u.lv_current_pe[%ld].pe: %d rdev: %s rsector:%ld\n",
index, lv->u.lv_current_pe[index].pe,
kdevname(rdev_map), rsector_map);
} else { /* striped mapping */
ulong stripe_index;
ulong stripe_length;
stripe_length = vg_this->pe_size * lv->u.lv_stripes;
stripe_index = (rsector_org % stripe_length) /
lv->u.lv_stripesize;
index = rsector_org / stripe_length +
(stripe_index % lv->u.lv_stripes) *
(lv->u.lv_allocated_le / lv->u.lv_stripes);
pe_start = lv->u.lv_current_pe[index].pe;
rsector_map = lv->u.lv_current_pe[index].pe +
(rsector_org % stripe_length) -
(stripe_index % lv->u.lv_stripes) * lv->u.lv_stripesize -
stripe_index / lv->u.lv_stripes *
(lv->u.lv_stripes - 1) * lv->u.lv_stripesize;
rdev_map = lv->u.lv_current_pe[index].dev;
P_MAP("u.lv_current_pe[%ld].pe: %d rdev: %s rsector:%ld\n"
"stripe_length: %ld stripe_index: %ld\n",
index, lv->u.lv_current_pe[index].pe, kdevname(rdev_map),
rsector_map, stripe_length, stripe_index);
}
/*
* Queue writes to physical extents on the move until move completes.
* Don't get _pe_lock until there is a reasonable expectation that
* we need to queue this request, because this is in the fast path.
*/
if (rw == WRITE || rw == WRITEA) {
if(_defer_extent(bi, rw, rdev_map,
rsector_map, vg_this->pe_size)) {
up_read(&lv->lv_lock);
return 0;
}
lv->u.lv_current_pe[index].writes++; /* statistic */
} else
lv->u.lv_current_pe[index].reads++; /* statistic */
/* snapshot volume exception handling on physical device address base */
if (!(lv->u.lv_access & (LV_SNAPSHOT|LV_SNAPSHOT_ORG)))
goto out;
if (lv->u.lv_access & LV_SNAPSHOT) { /* remap snapshot */
if (lv->u.lv_block_exception)
lvm_snapshot_remap_block(&rdev_map, &rsector_map,
pe_start, lv);
else
goto bad;
} else if (rw == WRITE || rw == WRITEA) { /* snapshot origin */
lv_t *snap;
/* start with first snapshot and loop through all of
them */
for (snap = lv->u.lv_snapshot_next; snap;
snap = snap->u.lv_snapshot_next) {
/* Check for inactive snapshot */
if (!(snap->u.lv_status & LV_ACTIVE))
continue;
/* Serializes the COW with the accesses to the
snapshot device */
_remap_snapshot(rdev_map, rsector_map,
pe_start, snap, vg_this);
}
}
out:
bi->bi_dev = rdev_map;
bi->bi_sector = rsector_map;
up_read(&lv->lv_lock);
return 1;
bad:
bio_io_error(bi);
up_read(&lv->lv_lock);
return -1;
} /* lvm_map() */
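/*
 * Illustrative sketch, not part of the driver: the linear and striped
 * branches of lvm_map() both turn a logical sector into an index into
 * lv_current_pe[] plus an offset inside that extent.  The standalone
 * functions below repeat that arithmetic with plain integers so it can be
 * tried in user space.  struct le_pos, map_linear and map_striped are
 * hypothetical names; the final device sector would be the extent's start
 * sector plus the returned offset.
 */

```c
#include <assert.h>

struct le_pos {
    unsigned long index;   /* slot in the lv_current_pe[] table */
    unsigned long offset;  /* sectors past the start of that extent */
};

/* Linear mapping: extents are laid end to end. */
static struct le_pos map_linear(unsigned long sector, unsigned long pe_size)
{
    struct le_pos pos;

    assert(pe_size > 0);
    pos.index = sector / pe_size;
    pos.offset = sector % pe_size;
    return pos;
}

/* Striped mapping: chunks of stripesize sectors rotate over the stripes. */
static struct le_pos map_striped(unsigned long sector, unsigned long pe_size,
                                 unsigned long stripes,
                                 unsigned long stripesize,
                                 unsigned long allocated_le)
{
    struct le_pos pos;
    unsigned long stripe_length, in_row, stripe_index;

    assert(pe_size > 0 && stripes > 1 && stripesize > 0);
    stripe_length = pe_size * stripes;       /* sectors in one row of extents */
    in_row = sector % stripe_length;         /* position inside that row */
    stripe_index = in_row / stripesize;      /* chunk number inside the row */

    pos.index = sector / stripe_length +
                (stripe_index % stripes) * (allocated_le / stripes);
    pos.offset = in_row -
                 (stripe_index % stripes) * stripesize -
                 (stripe_index / stripes) * (stripes - 1) * stripesize;
    return pos;
}
```

/*
 * Example: with 8-sector extents, 2 stripes, 4-sector chunks and 4
 * allocated extents, logical sector 5 falls in the second chunk of the
 * first row, which belongs to the second stripe.
 */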
/*
* internal support functions
*/
#ifdef LVM_HD_NAME
/*
* generate "hard disk" name
*/
void lvm_hd_name(char *buf, int minor)
{
int len = 0;
lv_t *lv_ptr;
if (vg[VG_BLK(minor)] == NULL ||
(lv_ptr = vg[VG_BLK(minor)]->lv[LV_BLK(minor)]) == NULL)
return;
len = strlen(lv_ptr->u.lv_name) - 5;
memcpy(buf, &lv_ptr->u.lv_name[5], len);
buf[len] = 0;
return;
}
#endif
/*
* make request function
*/
static int lvm_make_request_fn(request_queue_t *q, struct bio *bio)
{
return (lvm_map(bio) <= 0) ? 0 : 1;
}
/********************************************************************
*
* Character device support functions
*
********************************************************************/
/*
* character device support function logical volume manager lock
*/
static int lvm_do_lock_lvm(void)
{
lock_try_again:
spin_lock(&lvm_lock);
if (lock != 0 && lock != current->pid) {
P_DEV("lvm_do_lock_lvm: locked by pid %d ...\n", lock);
spin_unlock(&lvm_lock);
interruptible_sleep_on(&lvm_wait);
if (signal_pending(current))
return -EINTR;
#ifdef LVM_TOTAL_RESET
if (lvm_reset_spindown > 0)
return -EACCES;
#endif
goto lock_try_again;
}
lock = current->pid;
P_DEV("lvm_do_lock_lvm: locking LVM for pid %d\n", lock);
spin_unlock(&lvm_lock);
return 0;
} /* lvm_do_lock_lvm */
/*
* character device support function lock/unlock physical extent
*/
static int lvm_do_pe_lock_unlock(vg_t *vg_ptr, void *arg)
{
pe_lock_req_t new_lock;
struct bio *bh;
uint p;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&new_lock, arg, sizeof(new_lock)) != 0)
return -EFAULT;
switch (new_lock.lock) {
case LOCK_PE:
for (p = 0; p < vg_ptr->pv_max; p++) {
if (vg_ptr->pv[p] != NULL &&
kdev_same(new_lock.data.pv_dev,
vg_ptr->pv[p]->pv_dev))
break;
}
if (p == vg_ptr->pv_max) return -ENXIO;
/*
* this sync relieves memory pressure to lessen the
* likelihood of pvmove being paged out - resulting in
* deadlock.
*
* This method of doing a pvmove is broken
*/
fsync_dev(pe_lock_req.data.lv_dev);
down_write(&_pe_lock);
if (pe_lock_req.lock == LOCK_PE) {
up_write(&_pe_lock);
return -EBUSY;
}
/* Should we do to_kdev_t() on the pv_dev and u.lv_dev??? */
pe_lock_req.lock = LOCK_PE;
pe_lock_req.data.lv_dev = new_lock.data.lv_dev;
pe_lock_req.data.pv_dev = new_lock.data.pv_dev;
pe_lock_req.data.pv_offset = new_lock.data.pv_offset;
up_write(&_pe_lock);
/* some requests may have got through since the fsync */
fsync_dev(pe_lock_req.data.pv_dev);
break;
case UNLOCK_PE:
down_write(&_pe_lock);
pe_lock_req.lock = UNLOCK_PE;
pe_lock_req.data.lv_dev = NODEV;
pe_lock_req.data.pv_dev = NODEV;
pe_lock_req.data.pv_offset = 0;
bh = _dequeue_io();
up_write(&_pe_lock);
/* handle all deferred io for this PE */
_flush_io(bh);
break;
default:
return -EINVAL;
}
return 0;
}
/*
* character device support function logical extent remap
*/
static int lvm_do_le_remap(vg_t *vg_ptr, void *arg)
{
uint l, le;
lv_t *lv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&le_remap_req, arg,
sizeof(le_remap_req_t)) != 0)
return -EFAULT;
for (l = 0; l < vg_ptr->lv_max; l++) {
lv_ptr = vg_ptr->lv[l];
if (lv_ptr != NULL &&
strcmp(lv_ptr->u.lv_name,
le_remap_req.lv_name) == 0) {
for (le = 0; le < lv_ptr->u.lv_allocated_le; le++) {
if (kdev_same(lv_ptr->u.lv_current_pe[le].dev,
le_remap_req.old_dev) &&
lv_ptr->u.lv_current_pe[le].pe ==
le_remap_req.old_pe) {
lv_ptr->u.lv_current_pe[le].dev =
le_remap_req.new_dev;
lv_ptr->u.lv_current_pe[le].pe =
le_remap_req.new_pe;
__update_hardsectsize(lv_ptr);
return 0;
}
}
return -EINVAL;
}
}
return -ENXIO;
} /* lvm_do_le_remap() */
/*
* character device support function VGDA create
*/
static int lvm_do_vg_create(void *arg, int minor)
{
int ret = 0;
ulong l, ls = 0, p, size;
vg_t *vg_ptr;
lv_t **snap_lv_ptr;
lv_t *tmplv;
if ((vg_ptr = kmalloc(sizeof(vg_t),GFP_KERNEL)) == NULL) {
printk(KERN_CRIT
"%s -- VG_CREATE: kmalloc error VG at line %d\n",
lvm_name, __LINE__);
return -ENOMEM;
}
/* get the volume group structure */
if (copy_from_user(vg_ptr, arg, sizeof(vg_t)) != 0) {
P_IOCTL("lvm_do_vg_create ERROR: copy VG ptr %p (%d bytes)\n",
arg, sizeof(vg_t));
kfree(vg_ptr);
return -EFAULT;
}
/* VG_CREATE now uses minor number in VG structure */
if (minor == -1) minor = vg_ptr->vg_number;
/* Validate it */
if (vg[VG_CHR(minor)] != NULL) {
P_IOCTL("lvm_do_vg_create ERROR: VG %d in use\n", minor);
kfree(vg_ptr);
return -EPERM;
}
/* we are not that active so far... */
vg_ptr->vg_status &= ~VG_ACTIVE;
vg_ptr->pe_allocated = 0;
if (vg_ptr->pv_max > ABS_MAX_PV) {
printk(KERN_WARNING
"%s -- Can't activate VG: ABS_MAX_PV too small\n",
lvm_name);
kfree(vg_ptr);
return -EPERM;
}
if (vg_ptr->lv_max > ABS_MAX_LV) {
printk(KERN_WARNING
"%s -- Can't activate VG: ABS_MAX_LV too small for %u\n",
lvm_name, vg_ptr->lv_max);
kfree(vg_ptr);
return -EPERM;
}
/* create devfs and procfs entries */
lvm_fs_create_vg(vg_ptr);
vg[VG_CHR(minor)] = vg_ptr;
/* get the physical volume structures */
vg_ptr->pv_act = vg_ptr->pv_cur = 0;
for (p = 0; p < vg_ptr->pv_max; p++) {
pv_t *pvp;
/* user space address */
if ((pvp = vg_ptr->pv[p]) != NULL) {
ret = lvm_do_pv_create(pvp, vg_ptr, p);
if ( ret != 0) {
lvm_do_vg_remove(minor);
return ret;
}
}
}
size = vg_ptr->lv_max * sizeof(lv_t *);
if ((snap_lv_ptr = vmalloc ( size)) == NULL) {
printk(KERN_CRIT
"%s -- VG_CREATE: vmalloc error snapshot LVs at line %d\n",
lvm_name, __LINE__);
lvm_do_vg_remove(minor);
return -ENOMEM;
}
memset(snap_lv_ptr, 0, size);
if ((tmplv = kmalloc(sizeof(lv_t),GFP_KERNEL)) == NULL) {
printk(KERN_CRIT
"%s -- VG_CREATE: kmalloc error LV at line %d\n",
lvm_name, __LINE__);
lvm_do_vg_remove(minor);
vfree(snap_lv_ptr);
return -ENOMEM;
}
/* get the logical volume structures */
vg_ptr->lv_cur = 0;
for (l = 0; l < vg_ptr->lv_max; l++) {
lv_t *lvp;
/* user space address */
if ((lvp = vg_ptr->lv[l]) != NULL) {
if (copy_from_user(tmplv, lvp, sizeof(userlv_t)) != 0) {
P_IOCTL("ERROR: copying LV ptr %p (%d bytes)\n",
lvp, sizeof(lv_t));
lvm_do_vg_remove(minor);
vfree(snap_lv_ptr);
kfree(tmplv);
return -EFAULT;
}
if ( tmplv->u.lv_access & LV_SNAPSHOT) {
snap_lv_ptr[ls] = lvp;
vg_ptr->lv[l] = NULL;
ls++;
continue;
}
vg_ptr->lv[l] = NULL;
/* only create original logical volumes for now */
if (lvm_do_lv_create(minor, tmplv->u.lv_name, &tmplv->u) != 0) {
lvm_do_vg_remove(minor);
vfree(snap_lv_ptr);
kfree(tmplv);
return -EFAULT;
}
}
}
/* Second pass to correct snapshot logical volumes which are not
in place during first pass above */
for (l = 0; l < ls; l++) {
lv_t *lvp = snap_lv_ptr[l];
if (copy_from_user(tmplv, lvp, sizeof(userlv_t)) != 0) {
lvm_do_vg_remove(minor);
vfree(snap_lv_ptr);
kfree(tmplv);
return -EFAULT;
}
if (lvm_do_lv_create(minor, tmplv->u.lv_name, &tmplv->u) != 0) {
lvm_do_vg_remove(minor);
vfree(snap_lv_ptr);
kfree(tmplv);
return -EFAULT;
}
}
vfree(snap_lv_ptr);
kfree(tmplv);
vg_count++;
MOD_INC_USE_COUNT;
/* let's go active */
vg_ptr->vg_status |= VG_ACTIVE;
return 0;
} /* lvm_do_vg_create() */
/*
* character device support function VGDA extend
*/
static int lvm_do_vg_extend(vg_t *vg_ptr, void *arg)
{
int ret = 0;
uint p;
pv_t *pv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (vg_ptr->pv_cur < vg_ptr->pv_max) {
for (p = 0; p < vg_ptr->pv_max; p++) {
if ( ( pv_ptr = vg_ptr->pv[p]) == NULL) {
ret = lvm_do_pv_create(arg, vg_ptr, p);
if ( ret != 0) return ret;
pv_ptr = vg_ptr->pv[p];
vg_ptr->pe_total += pv_ptr->pe_total;
return 0;
}
}
}
return -EPERM;
} /* lvm_do_vg_extend() */
/*
* character device support function VGDA reduce
*/
static int lvm_do_vg_reduce(vg_t *vg_ptr, void *arg) {
uint p;
pv_t *pv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(pv_name, arg, sizeof(pv_name)) != 0)
return -EFAULT;
for (p = 0; p < vg_ptr->pv_max; p++) {
pv_ptr = vg_ptr->pv[p];
if (pv_ptr != NULL &&
strcmp(pv_ptr->pv_name,
pv_name) == 0) {
if (pv_ptr->lv_cur > 0) return -EPERM;
lvm_do_pv_remove(vg_ptr, p);
/* Make PV pointer array contiguous */
for (; p < vg_ptr->pv_max - 1; p++)
vg_ptr->pv[p] = vg_ptr->pv[p + 1];
/* p is now pv_max - 1: clear the last slot, which was shifted down */
vg_ptr->pv[p] = NULL;
return 0;
}
}
return -ENXIO;
} /* lvm_do_vg_reduce */
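/*
 * Illustrative sketch, not part of the driver: lvm_do_vg_reduce() keeps
 * the PV pointer array contiguous by shifting later entries down after a
 * removal.  The hypothetical helper below shows one way to do that shift
 * on a generic pointer array, clearing the last slot and never indexing
 * past count - 1.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Remove slots[idx] by shifting the tail down; the last slot becomes NULL. */
static void remove_slot(void **slots, size_t count, size_t idx)
{
    size_t i;

    assert(slots != NULL && idx < count);
    for (i = idx; i + 1 < count; i++)
        slots[i] = slots[i + 1];
    slots[count - 1] = NULL;
}
```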
/*
* character device support function VG rename
*/
static int lvm_do_vg_rename(vg_t *vg_ptr, void *arg)
{
int l = 0, p = 0, len = 0;
char vg_name[NAME_LEN] = { 0,};
char lv_name[NAME_LEN] = { 0,};
char *ptr = NULL;
lv_t *lv_ptr = NULL;
pv_t *pv_ptr = NULL;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(vg_name, arg, sizeof(vg_name)) != 0)
return -EFAULT;
lvm_fs_remove_vg(vg_ptr);
strncpy ( vg_ptr->vg_name, vg_name, sizeof ( vg_name)-1);
for ( l = 0; l < vg_ptr->lv_max; l++)
{
if ((lv_ptr = vg_ptr->lv[l]) == NULL) continue;
strncpy(lv_ptr->u.vg_name, vg_name, sizeof ( vg_name));
ptr = strrchr(lv_ptr->u.lv_name, '/');
if (ptr == NULL) ptr = lv_ptr->u.lv_name;
strncpy(lv_name, ptr, sizeof ( lv_name));
len = sizeof(LVM_DIR_PREFIX);
strcpy(lv_ptr->u.lv_name, LVM_DIR_PREFIX);
strncat(lv_ptr->u.lv_name, vg_name, NAME_LEN - len);
len += strlen ( vg_name);
strncat(lv_ptr->u.lv_name, lv_name, NAME_LEN - len);
}
for ( p = 0; p < vg_ptr->pv_max; p++)
{
if ( (pv_ptr = vg_ptr->pv[p]) == NULL) continue;
strncpy(pv_ptr->vg_name, vg_name, NAME_LEN);
}
lvm_fs_create_vg(vg_ptr);
return 0;
} /* lvm_do_vg_rename */
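/*
 * Illustrative sketch, not part of the driver: lvm_do_vg_rename() rebuilds
 * each LV name as LVM_DIR_PREFIX + new VG name + the old name's final
 * "/component".  The hypothetical helper below does the same composition
 * with snprintf() so that truncation is detected rather than silently
 * accepted.  The "/dev/" prefix used in the example is an assumption about
 * the value of LVM_DIR_PREFIX.
 */

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 on success, -1 if the result would not fit in out. */
static int build_lv_path(char *out, size_t out_len, const char *prefix,
                         const char *vg_name, const char *old_path)
{
    const char *tail;
    int n;

    assert(out != NULL && out_len > 0);
    /* keep everything from the last '/' on, or the whole name if none */
    tail = strrchr(old_path, '/');
    if (tail == NULL)
        tail = old_path;
    n = snprintf(out, out_len, "%s%s%s", prefix, vg_name, tail);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

/*
 * Example: renaming vg0 to vg1 turns "/dev/vg0/lvol1" into
 * "/dev/vg1/lvol1".
 */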
/*
* character device support function VGDA remove
*/
static int lvm_do_vg_remove(int minor)
{
int i;
vg_t *vg_ptr = vg[VG_CHR(minor)];
pv_t *pv_ptr;
if (vg_ptr == NULL) return -ENXIO;
#ifdef LVM_TOTAL_RESET
if (vg_ptr->lv_open > 0 && lvm_reset_spindown == 0)
#else
if (vg_ptr->lv_open > 0)
#endif
return -EPERM;
/* let's go inactive */
vg_ptr->vg_status &= ~VG_ACTIVE;
/* remove from procfs and devfs */
lvm_fs_remove_vg(vg_ptr);
/* free LVs */
/* first free snapshot logical volumes */
for (i = 0; i < vg_ptr->lv_max; i++) {
if (vg_ptr->lv[i] != NULL &&
vg_ptr->lv[i]->u.lv_access & LV_SNAPSHOT) {
lvm_do_lv_remove(minor, NULL, i);
current->state = TASK_UNINTERRUPTIBLE;
schedule_timeout(1);
}
}
/* then free the rest of the LVs */
for (i = 0; i < vg_ptr->lv_max; i++) {
if (vg_ptr->lv[i] != NULL) {
lvm_do_lv_remove(minor, NULL, i);
current->state = TASK_UNINTERRUPTIBLE;
schedule_timeout(1);
}
}
/* free PVs */
for (i = 0; i < vg_ptr->pv_max; i++) {
if ((pv_ptr = vg_ptr->pv[i]) != NULL) {
P_KFREE("%s -- kfree %d\n", lvm_name, __LINE__);
lvm_do_pv_remove(vg_ptr, i);
}
}
P_KFREE("%s -- kfree %d\n", lvm_name, __LINE__);
kfree(vg_ptr);
vg[VG_CHR(minor)] = NULL;
vg_count--;
MOD_DEC_USE_COUNT;
return 0;
} /* lvm_do_vg_remove() */
/*
* character device support function physical volume create
*/
static int lvm_do_pv_create(pv_t *pvp, vg_t *vg_ptr, ulong p) {
pv_t *pv;
int err;
pv = kmalloc(sizeof(pv_t),GFP_KERNEL);
if (pv == NULL) {
printk(KERN_CRIT
"%s -- PV_CREATE: kmalloc error PV at line %d\n",
lvm_name, __LINE__);
return -ENOMEM;
}
memset(pv, 0, sizeof(*pv));
if (copy_from_user(pv, pvp, sizeof(pv_t)) != 0) {
P_IOCTL("lvm_do_pv_create ERROR: copy PV ptr %p (%d bytes)\n",
pvp, sizeof(pv_t));
kfree(pv);
return -EFAULT;
}
if ((err = _open_pv(pv))) {
kfree(pv);
return err;
}
/* We don't need the PE list
in kernel space as with LVs pe_t list (see below) */
pv->pe = NULL;
pv->pe_allocated = 0;
pv->pv_status = PV_ACTIVE;
vg_ptr->pv_act++;
vg_ptr->pv_cur++;
lvm_fs_create_pv(vg_ptr, pv);
vg_ptr->pv[p] = pv;
return 0;
} /* lvm_do_pv_create() */
/*
* character device support function physical volume remove
*/
static int lvm_do_pv_remove(vg_t *vg_ptr, ulong p) {
pv_t *pv = vg_ptr->pv[p];
lvm_fs_remove_pv(vg_ptr, pv);
vg_ptr->pe_total -= pv->pe_total;
vg_ptr->pv_cur--;
vg_ptr->pv_act--;
_close_pv(pv);
kfree(pv);
vg_ptr->pv[p] = NULL;
return 0;
}
static void __update_hardsectsize(lv_t *lv) {
int le, e;
int max_hardsectsize = 0, hardsectsize;
for (le = 0; le < lv->u.lv_allocated_le; le++) {
hardsectsize = get_hardsect_size(lv->u.lv_current_pe[le].dev);
if (hardsectsize == 0)
hardsectsize = 512;
if (hardsectsize > max_hardsectsize)
max_hardsectsize = hardsectsize;
}
/* only perform this operation on active snapshots */
if ((lv->u.lv_access & LV_SNAPSHOT) &&
(lv->u.lv_status & LV_ACTIVE)) {
for (e = 0; e < lv->u.lv_remap_end; e++) {
hardsectsize = get_hardsect_size( lv->u.lv_block_exception[e].rdev_new);
if (hardsectsize == 0)
hardsectsize = 512;
if (hardsectsize > max_hardsectsize)
max_hardsectsize = hardsectsize;
}
}
}
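/*
 * Illustrative sketch, not part of the driver: __update_hardsectsize()
 * scans the devices backing an LV, treats a reported sector size of 0 as
 * 512, and keeps the largest value seen.  The hypothetical helper below
 * performs the same reduction over a plain array of sizes and returns the
 * result to the caller.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Largest sector size in sizes[0..n), with 0 read as 512; 0 if n is 0. */
static int max_hardsect_size(const int *sizes, size_t n)
{
    int max = 0;
    size_t i;

    assert(sizes != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int s = sizes[i] ? sizes[i] : 512;

        if (s > max)
            max = s;
    }
    return max;
}
```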
/*
* character device support function logical volume create
*/
static int lvm_do_lv_create(int minor, char *lv_name, userlv_t *ulv)
{
int e, ret, l, le, l_new, p, size, activate = 1;
ulong lv_status_save;
lv_block_exception_t *lvbe = ulv->lv_block_exception;
vg_t *vg_ptr = vg[VG_CHR(minor)];
lv_t *lv_ptr = NULL;
pe_t *pep;
if (!(pep = ulv->lv_current_pe))
return -EINVAL;
if (_sectors_to_k(ulv->lv_chunk_size) > LVM_SNAPSHOT_MAX_CHUNK)
return -EINVAL;
for (l = 0; l < vg_ptr->lv_cur; l++) {
if (vg_ptr->lv[l] != NULL &&
strcmp(vg_ptr->lv[l]->u.lv_name, lv_name) == 0)
return -EEXIST;
}
/* in case of lv_remove(), lv_create() pair */
l_new = -1;
if (vg_ptr->lv[ulv->lv_number] == NULL)
l_new = ulv->lv_number;
else {
for (l = 0; l < vg_ptr->lv_max; l++) {
if (vg_ptr->lv[l] == NULL)
if (l_new == -1) l_new = l;
}
}
if (l_new == -1) return -EPERM;
else l = l_new;
if ((lv_ptr = kmalloc(sizeof(lv_t),GFP_KERNEL)) == NULL) {
printk(KERN_CRIT "%s -- LV_CREATE: kmalloc error LV at line %d\n",
lvm_name, __LINE__);
return -ENOMEM;
}
/* copy preloaded LV */
memcpy((char *) lv_ptr, (char *) ulv, sizeof(userlv_t));
lv_status_save = lv_ptr->u.lv_status;
lv_ptr->u.lv_status &= ~LV_ACTIVE;
lv_ptr->u.lv_snapshot_org = NULL;
lv_ptr->u.lv_snapshot_prev = NULL;
lv_ptr->u.lv_snapshot_next = NULL;
lv_ptr->u.lv_block_exception = NULL;
lv_ptr->lv_iobuf = NULL;
lv_ptr->lv_COW_table_iobuf = NULL;
lv_ptr->lv_snapshot_hash_table = NULL;
lv_ptr->lv_snapshot_hash_table_size = 0;
lv_ptr->lv_snapshot_hash_mask = 0;
init_rwsem(&lv_ptr->lv_lock);
lv_ptr->lv_snapshot_use_rate = 0;
vg_ptr->lv[l] = lv_ptr;
/* get the PE structures from user space if this
is not a snapshot logical volume */
if (!(lv_ptr->u.lv_access & LV_SNAPSHOT)) {
size = lv_ptr->u.lv_allocated_le * sizeof(pe_t);
if ((lv_ptr->u.lv_current_pe = vmalloc(size)) == NULL) {
printk(KERN_CRIT
"%s -- LV_CREATE: vmalloc error LV_CURRENT_PE of %d Byte "
"at line %d\n",
lvm_name, size, __LINE__);
P_KFREE("%s -- kfree %d\n", lvm_name, __LINE__);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -ENOMEM;
}
if (copy_from_user(lv_ptr->u.lv_current_pe, pep, size)) {
P_IOCTL("ERROR: copying PE ptr %p (%d bytes)\n",
pep, size);
vfree(lv_ptr->u.lv_current_pe);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -EFAULT;
}
/* correct the PE count in PVs */
for (le = 0; le < lv_ptr->u.lv_allocated_le; le++) {
vg_ptr->pe_allocated++;
for (p = 0; p < vg_ptr->pv_cur; p++) {
if (kdev_same(vg_ptr->pv[p]->pv_dev,
lv_ptr->u.lv_current_pe[le].dev))
vg_ptr->pv[p]->pe_allocated++;
}
}
} else {
/* Get snapshot exception data and block list */
if (lvbe != NULL) {
lv_ptr->u.lv_snapshot_org =
vg_ptr->lv[LV_BLK(lv_ptr->u.lv_snapshot_minor)];
if (lv_ptr->u.lv_snapshot_org != NULL) {
size = lv_ptr->u.lv_remap_end * sizeof(lv_block_exception_t);
if(!size) {
printk(KERN_WARNING
"%s -- zero length exception table requested\n",
lvm_name);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -EINVAL;
}
if ((lv_ptr->u.lv_block_exception = vmalloc(size)) == NULL) {
printk(KERN_CRIT
"%s -- lvm_do_lv_create: vmalloc error LV_BLOCK_EXCEPTION "
"of %d byte at line %d\n",
lvm_name, size, __LINE__);
P_KFREE("%s -- kfree %d\n", lvm_name,
__LINE__);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -ENOMEM;
}
if (copy_from_user(lv_ptr->u.lv_block_exception, lvbe, size)) {
vfree(lv_ptr->u.lv_block_exception);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -EFAULT;
}
if(lv_ptr->u.lv_block_exception[0].rsector_org ==
LVM_SNAPSHOT_DROPPED_SECTOR)
{
printk(KERN_WARNING
"%s -- lvm_do_lv_create: snapshot has been dropped and will not be activated\n",
lvm_name);
activate = 0;
}
/* point to the original logical volume */
lv_ptr = lv_ptr->u.lv_snapshot_org;
lv_ptr->u.lv_snapshot_minor = 0;
lv_ptr->u.lv_snapshot_org = lv_ptr;
/* our new one now back points to the previous last in the chain
which can be the original logical volume */
lv_ptr = vg_ptr->lv[l];
/* now lv_ptr points to our new last snapshot logical volume */
lv_ptr->u.lv_current_pe = lv_ptr->u.lv_snapshot_org->u.lv_current_pe;
lv_ptr->lv_allocated_snapshot_le = lv_ptr->u.lv_allocated_le;
lv_ptr->u.lv_allocated_le = lv_ptr->u.lv_snapshot_org->u.lv_allocated_le;
lv_ptr->u.lv_current_le = lv_ptr->u.lv_snapshot_org->u.lv_current_le;
lv_ptr->u.lv_size = lv_ptr->u.lv_snapshot_org->u.lv_size;
lv_ptr->u.lv_stripes = lv_ptr->u.lv_snapshot_org->u.lv_stripes;
lv_ptr->u.lv_stripesize = lv_ptr->u.lv_snapshot_org->u.lv_stripesize;
/* Update the VG PE(s) used by snapshot reserve space. */
vg_ptr->pe_allocated += lv_ptr->lv_allocated_snapshot_le;
if ((ret = lvm_snapshot_alloc(lv_ptr)) != 0)
{
vfree(lv_ptr->u.lv_block_exception);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return ret;
}
for ( e = 0; e < lv_ptr->u.lv_remap_ptr; e++)
lvm_hash_link (lv_ptr->u.lv_block_exception + e,
lv_ptr->u.lv_block_exception[e].rdev_org,
lv_ptr->u.lv_block_exception[e].rsector_org, lv_ptr);
/* need to fill the COW exception table data
into the page for disk i/o */
if(lvm_snapshot_fill_COW_page(vg_ptr, lv_ptr)) {
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -EINVAL;
}
init_waitqueue_head(&lv_ptr->lv_snapshot_wait);
} else {
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
return -EFAULT;
}
} else {
kfree(vg_ptr->lv[l]);
vg_ptr->lv[l] = NULL;
return -EINVAL;
}
} /* if ( vg[VG_CHR(minor)]->lv[l]->u.lv_access & LV_SNAPSHOT) */
lv_ptr = vg_ptr->lv[l];
lvm_gendisk.part[minor(lv_ptr->u.lv_dev)].start_sect = 0;
lvm_gendisk.part[minor(lv_ptr->u.lv_dev)].nr_sects = lv_ptr->u.lv_size;
lvm_size[minor(lv_ptr->u.lv_dev)] = lv_ptr->u.lv_size >> 1;
vg_lv_map[minor(lv_ptr->u.lv_dev)].vg_number = vg_ptr->vg_number;
vg_lv_map[minor(lv_ptr->u.lv_dev)].lv_number = lv_ptr->u.lv_number;
vg_ptr->lv_cur++;
lv_ptr->u.lv_status = lv_status_save;
__update_hardsectsize(lv_ptr);
/* optionally add our new snapshot LV */
if (lv_ptr->u.lv_access & LV_SNAPSHOT) {
lv_t *org = lv_ptr->u.lv_snapshot_org, *last;
/* sync the original logical volume */
fsync_dev(org->u.lv_dev);
#ifdef LVM_VFS_ENHANCEMENT
/* VFS function call to sync and lock the filesystem */
fsync_dev_lockfs(org->u.lv_dev);
#endif
down_write(&org->lv_lock);
org->u.lv_access |= LV_SNAPSHOT_ORG;
lv_ptr->u.lv_access &= ~LV_SNAPSHOT_ORG; /* this can only hide a userspace bug */
/* Link in the list of snapshot volumes */
for (last = org; last->u.lv_snapshot_next; last = last->u.lv_snapshot_next);
lv_ptr->u.lv_snapshot_prev = last;
last->u.lv_snapshot_next = lv_ptr;
up_write(&org->lv_lock);
}
/* activate the logical volume */
if(activate)
lv_ptr->u.lv_status |= LV_ACTIVE;
else
lv_ptr->u.lv_status &= ~LV_ACTIVE;
if ( lv_ptr->u.lv_access & LV_WRITE)
set_device_ro(lv_ptr->u.lv_dev, 0);
else
set_device_ro(lv_ptr->u.lv_dev, 1);
#ifdef LVM_VFS_ENHANCEMENT
/* VFS function call to unlock the filesystem */
if (lv_ptr->u.lv_access & LV_SNAPSHOT)
unlockfs(lv_ptr->u.lv_snapshot_org->u.lv_dev);
#endif
lv_ptr->vg = vg_ptr;
lvm_gendisk.part[minor(lv_ptr->u.lv_dev)].de =
lvm_fs_create_lv(vg_ptr, lv_ptr);
return 0;
} /* lvm_do_lv_create() */
/*
* character device support function logical volume remove
*/
static int lvm_do_lv_remove(int minor, char *lv_name, int l)
{
uint le, p;
vg_t *vg_ptr = vg[VG_CHR(minor)];
lv_t *lv_ptr;
if (l == -1) {
for (l = 0; l < vg_ptr->lv_max; l++) {
if (vg_ptr->lv[l] != NULL &&
strcmp(vg_ptr->lv[l]->u.lv_name, lv_name) == 0) {
break;
}
}
}
if (l == vg_ptr->lv_max) return -ENXIO;
lv_ptr = vg_ptr->lv[l];
#ifdef LVM_TOTAL_RESET
if (lv_ptr->u.lv_open > 0 && lvm_reset_spindown == 0)
#else
if (lv_ptr->u.lv_open > 0)
#endif
return -EBUSY;
/* check for deletion of snapshot source while
snapshot volume still exists */
if ((lv_ptr->u.lv_access & LV_SNAPSHOT_ORG) &&
lv_ptr->u.lv_snapshot_next != NULL)
return -EPERM;
lvm_fs_remove_lv(vg_ptr, lv_ptr);
if (lv_ptr->u.lv_access & LV_SNAPSHOT) {
/*
* Atomically make the snapshot invisible
* to the original lv before playing with it.
*/
lv_t * org = lv_ptr->u.lv_snapshot_org;
down_write(&org->lv_lock);
/* remove this snapshot logical volume from the chain */
lv_ptr->u.lv_snapshot_prev->u.lv_snapshot_next = lv_ptr->u.lv_snapshot_next;
if (lv_ptr->u.lv_snapshot_next != NULL) {
lv_ptr->u.lv_snapshot_next->u.lv_snapshot_prev =
lv_ptr->u.lv_snapshot_prev;
}
/* no more snapshots? */
if (!org->u.lv_snapshot_next) {
org->u.lv_access &= ~LV_SNAPSHOT_ORG;
}
up_write(&org->lv_lock);
lvm_snapshot_release(lv_ptr);
/* Update the VG PE(s) used by snapshot reserve space. */
vg_ptr->pe_allocated -= lv_ptr->lv_allocated_snapshot_le;
}
lv_ptr->u.lv_status |= LV_SPINDOWN;
/* sync the buffers */
fsync_dev(lv_ptr->u.lv_dev);
lv_ptr->u.lv_status &= ~LV_ACTIVE;
/* invalidate the buffers */
invalidate_buffers(lv_ptr->u.lv_dev);
/* reset generic hd */
lvm_gendisk.part[minor(lv_ptr->u.lv_dev)].start_sect = -1;
lvm_gendisk.part[minor(lv_ptr->u.lv_dev)].nr_sects = 0;
lvm_gendisk.part[minor(lv_ptr->u.lv_dev)].de = 0;
lvm_size[minor(lv_ptr->u.lv_dev)] = 0;
/* reset VG/LV mapping */
vg_lv_map[minor(lv_ptr->u.lv_dev)].vg_number = ABS_MAX_VG;
vg_lv_map[minor(lv_ptr->u.lv_dev)].lv_number = -1;
/* correct the PE count in PVs if this is not a snapshot
logical volume */
if (!(lv_ptr->u.lv_access & LV_SNAPSHOT)) {
/* only if this is no snapshot logical volume because
we share the u.lv_current_pe[] structs with the
original logical volume */
for (le = 0; le < lv_ptr->u.lv_allocated_le; le++) {
vg_ptr->pe_allocated--;
for (p = 0; p < vg_ptr->pv_cur; p++) {
if (kdev_same(vg_ptr->pv[p]->pv_dev,
lv_ptr->u.lv_current_pe[le].dev))
vg_ptr->pv[p]->pe_allocated--;
}
}
vfree(lv_ptr->u.lv_current_pe);
}
P_KFREE("%s -- kfree %d\n", lvm_name, __LINE__);
kfree(lv_ptr);
vg_ptr->lv[l] = NULL;
vg_ptr->lv_cur--;
return 0;
} /* lvm_do_lv_remove() */
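The snapshot chain handling in lvm_do_lv_create() and lvm_do_lv_remove() is an ordinary doubly linked list whose head is the original LV. The following is a minimal user-space sketch of that append/unlink logic, not the driver code: `struct demo_lv`, `demo_chain_append()` and `demo_chain_unlink()` are hypothetical names, and the `lv_lock` read/write semaphore that the driver takes around these updates is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for lv_t; only the chain pointers are modelled. */
struct demo_lv {
	struct demo_lv *snapshot_prev;
	struct demo_lv *snapshot_next;
};

/* Append snap after the last element of the chain that starts at org. */
static void demo_chain_append(struct demo_lv *org, struct demo_lv *snap)
{
	struct demo_lv *last = org;

	while (last->snapshot_next)
		last = last->snapshot_next;
	snap->snapshot_prev = last;
	snap->snapshot_next = NULL;
	last->snapshot_next = snap;
}

/* Unlink snap from the chain; returns 1 if org has no snapshots left. */
static int demo_chain_unlink(struct demo_lv *org, struct demo_lv *snap)
{
	snap->snapshot_prev->snapshot_next = snap->snapshot_next;
	if (snap->snapshot_next)
		snap->snapshot_next->snapshot_prev = snap->snapshot_prev;
	snap->snapshot_prev = NULL;
	snap->snapshot_next = NULL;
	return org->snapshot_next == NULL;
}
```

A return value of 1 from `demo_chain_unlink()` corresponds to the point where the driver clears LV_SNAPSHOT_ORG on the original volume.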
/*
* logical volume extend / reduce
*/
static int __extend_reduce_snapshot(vg_t *vg_ptr, lv_t *old_lv, lv_t *new_lv) {
ulong size;
lv_block_exception_t *lvbe;
if (!new_lv->u.lv_block_exception)
return -ENXIO;
size = new_lv->u.lv_remap_end * sizeof(lv_block_exception_t);
if ((lvbe = vmalloc(size)) == NULL) {
printk(KERN_CRIT
"%s -- lvm_do_lv_extend_reduce: vmalloc "
"error LV_BLOCK_EXCEPTION of %lu Byte at line %d\n",
lvm_name, size, __LINE__);
return -ENOMEM;
}
if ((new_lv->u.lv_remap_end > old_lv->u.lv_remap_end) &&
(copy_from_user(lvbe, new_lv->u.lv_block_exception, size))) {
vfree(lvbe);
return -EFAULT;
}
new_lv->u.lv_block_exception = lvbe;
if (lvm_snapshot_alloc_hash_table(new_lv)) {
vfree(new_lv->u.lv_block_exception);
return -ENOMEM;
}
return 0;
}
static int __extend_reduce(vg_t *vg_ptr, lv_t *old_lv, lv_t *new_lv) {
ulong size, l, p, end;
pe_t *pe;
/* allocate space for new pe structures */
size = new_lv->u.lv_current_le * sizeof(pe_t);
if ((pe = vmalloc(size)) == NULL) {
printk(KERN_CRIT
"%s -- lvm_do_lv_extend_reduce: "
"vmalloc error LV_CURRENT_PE of %lu Byte at line %d\n",
lvm_name, size, __LINE__);
return -ENOMEM;
}
/* get the PE structures from user space */
if (copy_from_user(pe, new_lv->u.lv_current_pe, size)) {
if(old_lv->u.lv_access & LV_SNAPSHOT)
vfree(new_lv->lv_snapshot_hash_table);
vfree(pe);
return -EFAULT;
}
new_lv->u.lv_current_pe = pe;
/* reduce allocation counters on PV(s) */
for (l = 0; l < old_lv->u.lv_allocated_le; l++) {
vg_ptr->pe_allocated--;
for (p = 0; p < vg_ptr->pv_cur; p++) {
if (kdev_same(vg_ptr->pv[p]->pv_dev,
old_lv->u.lv_current_pe[l].dev)) {
vg_ptr->pv[p]->pe_allocated--;
break;
}
}
}
/* extend the PE count in PVs */
for (l = 0; l < new_lv->u.lv_allocated_le; l++) {
vg_ptr->pe_allocated++;
for (p = 0; p < vg_ptr->pv_cur; p++) {
if (kdev_same(vg_ptr->pv[p]->pv_dev,
new_lv->u.lv_current_pe[l].dev)) {
vg_ptr->pv[p]->pe_allocated++;
break;
}
}
}
/* save available I/O statistics */
if (old_lv->u.lv_stripes < 2) { /* linear logical volume */
end = min(old_lv->u.lv_current_le, new_lv->u.lv_current_le);
for (l = 0; l < end; l++) {
new_lv->u.lv_current_pe[l].reads +=
old_lv->u.lv_current_pe[l].reads;
new_lv->u.lv_current_pe[l].writes +=
old_lv->u.lv_current_pe[l].writes;
}
} else { /* striped logical volume */
uint i, j, source, dest, end, old_stripe_size, new_stripe_size;
old_stripe_size = old_lv->u.lv_allocated_le / old_lv->u.lv_stripes;
new_stripe_size = new_lv->u.lv_allocated_le / new_lv->u.lv_stripes;
end = min(old_stripe_size, new_stripe_size);
for (i = source = dest = 0;
i < new_lv->u.lv_stripes; i++) {
for (j = 0; j < end; j++) {
new_lv->u.lv_current_pe[dest + j].reads +=
old_lv->u.lv_current_pe[source + j].reads;
new_lv->u.lv_current_pe[dest + j].writes +=
old_lv->u.lv_current_pe[source + j].writes;
}
source += old_stripe_size;
dest += new_stripe_size;
}
}
return 0;
}
static int lvm_do_lv_extend_reduce(int minor, char *lv_name, userlv_t *ulv)
{
int r;
ulong l, e, size;
vg_t *vg_ptr = vg[VG_CHR(minor)];
lv_t *old_lv;
lv_t *new_lv;
pe_t *pe;
if((new_lv = kmalloc(sizeof(lv_t),GFP_KERNEL)) == NULL){
printk(KERN_CRIT
"%s -- LV_EXTEND/REDUCE: kmalloc error LV at line %d\n",
lvm_name,__LINE__);
return -ENOMEM;
}
memset(new_lv,0,sizeof(lv_t));
memcpy(&new_lv->u,ulv,sizeof(userlv_t));
if ((pe = new_lv->u.lv_current_pe) == NULL)
return -EINVAL;
for (l = 0; l < vg_ptr->lv_max; l++)
if (vg_ptr->lv[l] && !strcmp(vg_ptr->lv[l]->u.lv_name, lv_name))
break;
if (l == vg_ptr->lv_max)
return -ENXIO;
old_lv = vg_ptr->lv[l];
if (old_lv->u.lv_access & LV_SNAPSHOT) {
/* only perform this operation on active snapshots */
if (old_lv->u.lv_status & LV_ACTIVE)
r = __extend_reduce_snapshot(vg_ptr, old_lv, new_lv);
else
r = -EPERM;
} else
r = __extend_reduce(vg_ptr, old_lv, new_lv);
if(r)
return r;
/* copy relevant fields */
down_write(&old_lv->lv_lock);
if(new_lv->u.lv_access & LV_SNAPSHOT) {
size = (new_lv->u.lv_remap_end > old_lv->u.lv_remap_end) ?
old_lv->u.lv_remap_ptr : new_lv->u.lv_remap_end;
size *= sizeof(lv_block_exception_t);
memcpy(new_lv->u.lv_block_exception,
old_lv->u.lv_block_exception, size);
old_lv->u.lv_remap_end = new_lv->u.lv_remap_end;
old_lv->u.lv_block_exception = new_lv->u.lv_block_exception;
old_lv->lv_snapshot_hash_table =
new_lv->lv_snapshot_hash_table;
old_lv->lv_snapshot_hash_table_size =
new_lv->lv_snapshot_hash_table_size;
old_lv->lv_snapshot_hash_mask =
new_lv->lv_snapshot_hash_mask;
for (e = 0; e < new_lv->u.lv_remap_ptr; e++)
lvm_hash_link(new_lv->u.lv_block_exception + e,
new_lv->u.lv_block_exception[e].rdev_org,
new_lv->u.lv_block_exception[e].rsector_org,
new_lv);
} else {
vfree(old_lv->u.lv_current_pe);
vfree(old_lv->lv_snapshot_hash_table);
old_lv->u.lv_size = new_lv->u.lv_size;
old_lv->u.lv_allocated_le = new_lv->u.lv_allocated_le;
old_lv->u.lv_current_le = new_lv->u.lv_current_le;
old_lv->u.lv_current_pe = new_lv->u.lv_current_pe;
lvm_gendisk.part[minor(old_lv->u.lv_dev)].nr_sects =
old_lv->u.lv_size;
lvm_size[minor(old_lv->u.lv_dev)] = old_lv->u.lv_size >> 1;
if (old_lv->u.lv_access & LV_SNAPSHOT_ORG) {
lv_t *snap;
for(snap = old_lv->u.lv_snapshot_next; snap;
snap = snap->u.lv_snapshot_next) {
down_write(&snap->lv_lock);
snap->u.lv_current_pe = old_lv->u.lv_current_pe;
snap->u.lv_allocated_le =
old_lv->u.lv_allocated_le;
snap->u.lv_current_le = old_lv->u.lv_current_le;
snap->u.lv_size = old_lv->u.lv_size;
lvm_gendisk.part[minor(snap->u.lv_dev)].nr_sects
= old_lv->u.lv_size;
lvm_size[minor(snap->u.lv_dev)] =
old_lv->u.lv_size >> 1;
__update_hardsectsize(snap);
up_write(&snap->lv_lock);
}
}
}
__update_hardsectsize(old_lv);
up_write(&old_lv->lv_lock);
return 0;
} /* lvm_do_lv_extend_reduce() */
/*
* character device support function logical volume status by name
*/
static int lvm_do_lv_status_byname(vg_t *vg_ptr, void *arg)
{
uint l;
lv_status_byname_req_t lv_status_byname_req;
void *saved_ptr1;
void *saved_ptr2;
lv_t *lv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&lv_status_byname_req, arg,
sizeof(lv_status_byname_req_t)) != 0)
return -EFAULT;
if (lv_status_byname_req.lv == NULL) return -EINVAL;
for (l = 0; l < vg_ptr->lv_max; l++) {
if ((lv_ptr = vg_ptr->lv[l]) != NULL &&
strcmp(lv_ptr->u.lv_name,
lv_status_byname_req.lv_name) == 0) {
/* Save usermode pointers */
if (copy_from_user(&saved_ptr1, &lv_status_byname_req.lv->u.lv_current_pe, sizeof(void*)) != 0)
return -EFAULT;
if (copy_from_user(&saved_ptr2, &lv_status_byname_req.lv->u.lv_block_exception, sizeof(void*)) != 0)
return -EFAULT;
if (copy_to_user(lv_status_byname_req.lv,
lv_ptr,
sizeof(userlv_t)) != 0)
return -EFAULT;
if (saved_ptr1 != NULL) {
if (copy_to_user(saved_ptr1,
lv_ptr->u.lv_current_pe,
lv_ptr->u.lv_allocated_le *
sizeof(pe_t)) != 0)
return -EFAULT;
}
/* Restore usermode pointers */
if (copy_to_user(&lv_status_byname_req.lv->u.lv_current_pe, &saved_ptr1, sizeof(void*)) != 0)
return -EFAULT;
return 0;
}
}
return -ENXIO;
} /* lvm_do_lv_status_byname() */
/*
* character device support function logical volume status by index
*/
static int lvm_do_lv_status_byindex(vg_t *vg_ptr,void *arg)
{
lv_status_byindex_req_t lv_status_byindex_req;
void *saved_ptr1;
void *saved_ptr2;
lv_t *lv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&lv_status_byindex_req, arg,
sizeof(lv_status_byindex_req)) != 0)
return -EFAULT;
if (lv_status_byindex_req.lv == NULL)
return -EINVAL;
if (lv_status_byindex_req.lv_index <0 ||
lv_status_byindex_req.lv_index >= MAX_LV)
return -EINVAL;
if ( ( lv_ptr = vg_ptr->lv[lv_status_byindex_req.lv_index]) == NULL)
return -ENXIO;
/* Save usermode pointers */
if (copy_from_user(&saved_ptr1, &lv_status_byindex_req.lv->u.lv_current_pe, sizeof(void*)) != 0)
return -EFAULT;
if (copy_from_user(&saved_ptr2, &lv_status_byindex_req.lv->u.lv_block_exception, sizeof(void*)) != 0)
return -EFAULT;
if (copy_to_user(lv_status_byindex_req.lv, lv_ptr, sizeof(userlv_t)) != 0)
return -EFAULT;
if (saved_ptr1 != NULL) {
if (copy_to_user(saved_ptr1,
lv_ptr->u.lv_current_pe,
lv_ptr->u.lv_allocated_le *
sizeof(pe_t)) != 0)
return -EFAULT;
}
/* Restore usermode pointers */
if (copy_to_user(&lv_status_byindex_req.lv->u.lv_current_pe, &saved_ptr1, sizeof(void *)) != 0)
return -EFAULT;
return 0;
} /* lvm_do_lv_status_byindex() */
/*
* character device support function logical volume status by device number
*/
static int lvm_do_lv_status_bydev(vg_t * vg_ptr, void * arg) {
int l;
lv_status_bydev_req_t lv_status_bydev_req;
void *saved_ptr1;
void *saved_ptr2;
lv_t *lv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&lv_status_bydev_req, arg,
sizeof(lv_status_bydev_req)) != 0)
return -EFAULT;
for ( l = 0; l < vg_ptr->lv_max; l++) {
if ( vg_ptr->lv[l] == NULL) continue;
if ( kdev_same(vg_ptr->lv[l]->u.lv_dev,
to_kdev_t(lv_status_bydev_req.dev)))
break;
}
if ( l == vg_ptr->lv_max) return -ENXIO;
lv_ptr = vg_ptr->lv[l];
/* Save usermode pointers */
if (copy_from_user(&saved_ptr1, &lv_status_bydev_req.lv->u.lv_current_pe, sizeof(void*)) != 0)
return -EFAULT;
if (copy_from_user(&saved_ptr2, &lv_status_bydev_req.lv->u.lv_block_exception, sizeof(void*)) != 0)
return -EFAULT;
if (copy_to_user(lv_status_bydev_req.lv, lv_ptr, sizeof(lv_t)) != 0)
return -EFAULT;
if (saved_ptr1 != NULL) {
if (copy_to_user(saved_ptr1,
lv_ptr->u.lv_current_pe,
lv_ptr->u.lv_allocated_le *
sizeof(pe_t)) != 0)
return -EFAULT;
}
/* Restore usermode pointers */
if (copy_to_user(&lv_status_bydev_req.lv->u.lv_current_pe, &saved_ptr1, sizeof(void *)) != 0)
return -EFAULT;
return 0;
} /* lvm_do_lv_status_bydev() */
/*
* character device support function rename a logical volume
*/
static int lvm_do_lv_rename(vg_t *vg_ptr, lv_req_t *lv_req, userlv_t *ulv)
{
int l = 0;
int ret = 0;
lv_t *lv_ptr = NULL;
for (l = 0; l < vg_ptr->lv_max; l++)
{
if ( (lv_ptr = vg_ptr->lv[l]) == NULL) continue;
if (kdev_same(lv_ptr->u.lv_dev, ulv->lv_dev))
{
lvm_fs_remove_lv(vg_ptr, lv_ptr);
strncpy(lv_ptr->u.lv_name,
lv_req->lv_name,
NAME_LEN);
lvm_fs_create_lv(vg_ptr, lv_ptr);
break;
}
}
if (l == vg_ptr->lv_max) ret = -ENODEV;
return ret;
} /* lvm_do_lv_rename */
/*
* character device support function physical volume change
*/
static int lvm_do_pv_change(vg_t *vg_ptr, void *arg)
{
uint p;
pv_t *pv_ptr;
struct block_device *bd;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&pv_change_req, arg,
sizeof(pv_change_req)) != 0)
return -EFAULT;
for (p = 0; p < vg_ptr->pv_max; p++) {
pv_ptr = vg_ptr->pv[p];
if (pv_ptr != NULL &&
strcmp(pv_ptr->pv_name,
pv_change_req.pv_name) == 0) {
bd = pv_ptr->bd;
if (copy_from_user(pv_ptr,
pv_change_req.pv,
sizeof(pv_t)) != 0)
return -EFAULT;
pv_ptr->bd = bd;
/* We don't need the PE list
in kernel space as with LVs pe_t list */
pv_ptr->pe = NULL;
return 0;
}
}
return -ENXIO;
} /* lvm_do_pv_change() */
/*
* character device support function get physical volume status
*/
static int lvm_do_pv_status(vg_t *vg_ptr, void *arg)
{
uint p;
pv_t *pv_ptr;
if (vg_ptr == NULL) return -ENXIO;
if (copy_from_user(&pv_status_req, arg,
sizeof(pv_status_req)) != 0)
return -EFAULT;
for (p = 0; p < vg_ptr->pv_max; p++) {
pv_ptr = vg_ptr->pv[p];
if (pv_ptr != NULL &&
strcmp(pv_ptr->pv_name,
pv_status_req.pv_name) == 0) {
if (copy_to_user(pv_status_req.pv,
pv_ptr,
sizeof(pv_t)) != 0)
return -EFAULT;
return 0;
}
}
return -ENXIO;
} /* lvm_do_pv_status() */
/*
* character device support function flush and invalidate all buffers of a PV
*/
static int lvm_do_pv_flush(void *arg)
{
pv_flush_req_t pv_flush_req;
if (copy_from_user(&pv_flush_req, arg,
sizeof(pv_flush_req)) != 0)
return -EFAULT;
fsync_dev(pv_flush_req.pv_dev);
invalidate_buffers(pv_flush_req.pv_dev);
return 0;
}
/*
* support function initialize gendisk variables
*/
static void __init lvm_geninit(struct gendisk *lvm_gdisk)
{
int i = 0;
#ifdef DEBUG_GENDISK
printk(KERN_DEBUG "%s -- lvm_gendisk\n", lvm_name);
#endif
for (i = 0; i < MAX_LV; i++) {
lvm_gendisk.part[i].start_sect = -1; /* avoid partition check */
lvm_size[i] = lvm_gendisk.part[i].nr_sects = 0;
lvm_blocksizes[i] = BLOCK_SIZE;
}
blk_size[MAJOR_NR] = lvm_size;
blksize_size[MAJOR_NR] = lvm_blocksizes;
return;
} /* lvm_geninit() */
/* Must have down_write(_pe_lock) when we enqueue buffers */
static void _queue_io(struct bio *bh, int rw) {
if (bh->bi_next) BUG();
bh->bi_next = _pe_requests;
_pe_requests = bh;
}
/* Must have down_write(_pe_lock) when we dequeue buffers */
static struct bio *_dequeue_io(void)
{
struct bio *bh = _pe_requests;
_pe_requests = NULL;
return bh;
}
/*
* We do not need to hold _pe_lock to flush buffers. bh should be taken from
* _pe_requests under down_write(_pe_lock), and then _pe_requests can be set
* NULL and we drop _pe_lock. Any new buffers deferred at this time will be
* added to a new list, and the old buffers can have their I/O restarted
* asynchronously.
*
* If, for some reason, the same PE is locked again before all of these writes
* have finished, then these buffers will just be re-queued (i.e. no danger).
*/
static void _flush_io(struct bio *bh)
{
while (bh) {
struct bio *next = bh->bi_next;
bh->bi_next = NULL;
/* resubmit this bio */
bh->bi_rw = WRITE; /* needed? */
generic_make_request(bh);
bh = next;
}
}
/*
* we must open the pv's before we use them
*/
static int _open_pv(pv_t *pv) {
int err;
struct block_device *bd;
if (!(bd = bdget(kdev_t_to_nr(pv->pv_dev))))
return -ENOMEM;
err = blkdev_get(bd, FMODE_READ|FMODE_WRITE, 0, BDEV_FILE);
if (err)
return err;
pv->bd = bd;
return 0;
}
static void _close_pv(pv_t *pv) {
if (pv) {
struct block_device *bdev = pv->bd;
pv->bd = NULL;
if (bdev)
blkdev_put(bdev, BDEV_FILE);
}
}
static unsigned long _sectors_to_k(unsigned long sect)
{
if(SECTOR_SIZE > 1024) {
return sect * (SECTOR_SIZE / 1024);
}
return sect / (1024 / SECTOR_SIZE);
}
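The sector-to-kilobyte conversion above can be tried outside the kernel. This is a minimal sketch under stated assumptions: `sectors_to_kib()` is a hypothetical name, and the sector size is passed as a parameter, whereas the driver uses the compile-time SECTOR_SIZE constant. It also assumes the sector size either divides 1024 or is a multiple of it.

```c
#include <assert.h>

/* Hypothetical user-space variant of _sectors_to_k(). */
static unsigned long sectors_to_kib(unsigned long sect,
				    unsigned long sector_size)
{
	/* Assumption: sector_size divides 1024 or is a multiple of 1024. */
	assert(sector_size != 0 &&
	       (1024 % sector_size == 0 || sector_size % 1024 == 0));

	if (sector_size > 1024)
		return sect * (sector_size / 1024);
	/* Integer division: a trailing partial kilobyte is dropped. */
	return sect / (1024 / sector_size);
}
```

With 512-byte sectors the result is simply `sect / 2`, which matches the `lv_size >> 1` used elsewhere in the driver to fill lvm_size[].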
module_init(lvm_init);
module_exit(lvm_cleanup);
MODULE_LICENSE("GPL");
#ifndef _LINUX_KDEV_T_H
#define _LINUX_KDEV_T_H
-#if defined(__KERNEL__) || defined(_LVM_H_INCLUDE)
+#ifdef __KERNEL__
/*
As a preparation for the introduction of larger device numbers,
we introduce a type kdev_t to hold them. No information about
@@ -136,7 +136,7 @@ static inline kdev_t to_kdev_t(int dev)
return mk_kdev(MAJOR(dev),MINOR(dev));
}
-#else /* __KERNEL__ || _LVM_H_INCLUDE */
+#else /* __KERNEL__ */
/*
Some programs want their definitions of MAJOR and MINOR and MKDEV
@@ -145,5 +145,5 @@ from the kernel sources. These must be the externally visible ones.
#define MAJOR(dev) ((dev)>>8)
#define MINOR(dev) ((dev) & 0xff)
#define MKDEV(ma,mi) ((ma)<<8 | (mi))
-#endif /* __KERNEL__ || _LVM_H_INCLUDE */
+#endif /* __KERNEL__ */
#endif
#ifndef _LINUX_LIST_H
#define _LINUX_LIST_H
-#if defined(__KERNEL__) || defined(_LVM_H_INCLUDE)
+#ifdef __KERNEL__
#include <linux/prefetch.h>
#include <asm/system.h>
@@ -319,6 +319,7 @@ static inline void list_splice_init(struct list_head *list,
for (pos = (head)->next, n = pos->next; pos != (head); \
pos = n, ({ read_barrier_depends(); 0;}), n = pos->next)
-#endif /* __KERNEL__ || _LVM_H_INCLUDE */
+#else
+#warning "don't include kernel headers in userspace"
+#endif /* __KERNEL__ */
#endif
/*
* include/linux/lvm.h
* kernel/lvm.h
* tools/lib/lvm.h
*
* Copyright (C) 1997 - 2000 Heinz Mauelshagen, Sistina Software
*
* February-November 1997
* May-July 1998
* January-March,July,September,October,December 1999
* January,February,July,November 2000
* January 2001
*
* lvm is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2, or (at your option)
* any later version.
*
* lvm is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with GNU CC; see the file COPYING. If not, write to
* the Free Software Foundation, 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*
*/
/*
* Changelog
*
* 10/10/1997 - beginning of new structure creation
* 12/05/1998 - incorporated structures from lvm_v1.h and deleted lvm_v1.h
* 07/06/1998 - avoided LVM_KMALLOC_MAX define by using vmalloc/vfree
* instead of kmalloc/kfree
* 01/07/1998 - fixed wrong LVM_MAX_SIZE
* 07/07/1998 - extended pe_t structure by ios member (for statistic)
* 02/08/1998 - changes for official char/block major numbers
* 07/08/1998 - avoided init_module() and cleanup_module() to be static
* 29/08/1998 - separated core and disk structure type definitions
* 01/09/1998 - merged kernel integration version (mike)
* 20/01/1999 - added LVM_PE_DISK_OFFSET macro for use in
* vg_read_with_pv_and_lv(), pv_move_pe(), pv_show_pe_text()...
* 18/02/1999 - added definition of time_disk_t structure, which
* keeps time stamps on disk for nonatomic writes (future)
* 15/03/1999 - corrected LV() and VG() macro definition to use argument
* instead of minor
* 03/07/1999 - define for genhd.c name handling
* 23/07/1999 - implemented snapshot part
* 08/12/1999 - changed LVM_LV_SIZE_MAX macro to reflect current 1TB limit
* 01/01/2000 - extended lv_v2 core structure by wait_queue member
* 12/02/2000 - integrated Andrea Arcangeli's snapshot work
* 14/02/2001 - changed LVM_SNAPSHOT_MIN_CHUNK to 1 page
* 18/02/2000 - separated user and kernel space parts by
* #ifdef them with __KERNEL__
* 08/03/2000 - implemented cluster/shared bits for vg_access
* 26/06/2000 - implemented snapshot persistency and resizing support
* 02/11/2000 - added hash table size member to lv structure
* 12/11/2000 - removed unneeded timestamp definitions
* 24/12/2000 - removed LVM_TO_{CORE,DISK}*, use cpu_{from, to}_le*
* instead - Christoph Hellwig
* 01/03/2001 - Rename VG_CREATE to VG_CREATE_OLD and add new VG_CREATE
* 08/03/2001 - new lv_t (in core) version number 5: changed page member
* to (struct kiobuf *) to use for COW exception table io
* 23/03/2001 - Change a (presumably) mistyped pv_t* to an lv_t*
* 26/03/2001 - changed lv_v4 to lv_v5 in structure definition [HM]
*
*/
#ifndef _LVM_H_INCLUDE
#define _LVM_H_INCLUDE
#define LVM_RELEASE_NAME "1.0.1-rc4(ish)"
#define LVM_RELEASE_DATE "03/10/2001"
#define _LVM_KERNEL_H_VERSION "LVM "LVM_RELEASE_NAME" ("LVM_RELEASE_DATE")"
#include <linux/version.h>
/*
* preprocessor definitions
*/
/* if you like emergency reset code in the driver */
#define LVM_TOTAL_RESET
#ifdef __KERNEL__
#undef LVM_HD_NAME /* display nice names in /proc/partitions */
/* lots of debugging output (see driver source)
#define DEBUG_LVM_GET_INFO
#define DEBUG
#define DEBUG_MAP
#define DEBUG_MAP_SIZE
#define DEBUG_IOCTL
#define DEBUG_READ
#define DEBUG_GENDISK
#define DEBUG_VG_CREATE
#define DEBUG_LVM_BLK_OPEN
#define DEBUG_KFREE
*/
#endif /* #ifdef __KERNEL__ */
#include <linux/kdev_t.h>
#include <linux/list.h>
#include <asm/types.h>
#include <linux/major.h>
#ifdef __KERNEL__
#include <linux/spinlock.h>
#include <asm/semaphore.h>
#endif /* #ifdef __KERNEL__ */
#include <asm/page.h>
#if !defined ( LVM_BLK_MAJOR) || !defined ( LVM_CHAR_MAJOR)
#error Bad include/linux/major.h - LVM MAJOR undefined
#endif
#ifdef BLOCK_SIZE
#undef BLOCK_SIZE
#endif
#ifdef CONFIG_ARCH_S390
#define BLOCK_SIZE 4096
#else
#define BLOCK_SIZE 1024
#endif
#ifndef SECTOR_SIZE
#define SECTOR_SIZE 512
#endif
/* structure version */
#define LVM_STRUCT_VERSION 1
#define LVM_DIR_PREFIX "/dev/"
/*
* i/o protocol version
*
* defined here for the driver and defined separately in the
* user land tools/lib/liblvm.h
*
*/
#define LVM_DRIVER_IOP_VERSION 10
#define LVM_NAME "lvm"
#define LVM_GLOBAL "global"
#define LVM_DIR "lvm"
#define LVM_VG_SUBDIR "VGs"
#define LVM_LV_SUBDIR "LVs"
#define LVM_PV_SUBDIR "PVs"
/*
* VG/LV indexing macros
*/
/* character minor maps directly to volume group */
#define VG_CHR(a) ( a)
/* block minor indexes into a volume group/logical volume indirection table */
#define VG_BLK(a) ( vg_lv_map[a].vg_number)
#define LV_BLK(a) ( vg_lv_map[a].lv_number)
/*
* absolute limits for VGs, PVs per VG and LVs per VG
*/
#define ABS_MAX_VG 99
#define ABS_MAX_PV 256
#define ABS_MAX_LV 256 /* caused by 8 bit minor */
#define MAX_VG ABS_MAX_VG
#define MAX_LV ABS_MAX_LV
#define MAX_PV ABS_MAX_PV
#if ( MAX_VG > ABS_MAX_VG)
#undef MAX_VG
#define MAX_VG ABS_MAX_VG
#endif
#if ( MAX_LV > ABS_MAX_LV)
#undef MAX_LV
#define MAX_LV ABS_MAX_LV
#endif
/*
* LVM_PE_T_MAX corresponds to:
*
* 8KB PE size can map a ~512 MB logical volume at the cost of 1MB memory,
*
* 128MB PE size can map an 8TB logical volume at the same cost of memory.
*
* Default PE size of 4 MB gives a maximum logical volume size of 256 GB.
*
* Maximum PE size of 16GB gives a maximum logical volume size of 1024 TB.
*
* AFAIK, current kernels limit this to 1 TB.
*
* Should be a sufficient spectrum ;*)
*/
/* This is the usable size of pe_disk_t.le_num !!! v v */
#define LVM_PE_T_MAX ( ( 1 << ( sizeof ( uint16_t) * 8)) - 2)
#define LVM_LV_SIZE_MAX(a) ( ( long long) LVM_PE_T_MAX * (a)->pe_size > ( long long) 1024*1024/SECTOR_SIZE*1024*1024 ? ( long long) 1024*1024/SECTOR_SIZE*1024*1024 : ( long long) LVM_PE_T_MAX * (a)->pe_size)
#define LVM_MIN_PE_SIZE ( 8192L / SECTOR_SIZE) /* 8 KB in sectors */
#define LVM_MAX_PE_SIZE ( 16L * 1024L * 1024L / SECTOR_SIZE * 1024) /* 16GB in sectors */
#define LVM_DEFAULT_PE_SIZE ( 4096L * 1024 / SECTOR_SIZE) /* 4 MB in sectors */
#define LVM_DEFAULT_STRIPE_SIZE 16L /* 16 KB */
#define LVM_MIN_STRIPE_SIZE ( PAGE_SIZE/SECTOR_SIZE) /* PAGESIZE in sectors */
#define LVM_MAX_STRIPE_SIZE ( 512L * 1024 / SECTOR_SIZE) /* 512 KB in sectors */
#define LVM_MAX_STRIPES 128 /* max # of stripes */
#define LVM_MAX_SIZE ( 1024LU * 1024 / SECTOR_SIZE * 1024 * 1024) /* 1TB[sectors] */
#define LVM_MAX_MIRRORS 2 /* future use */
#define LVM_MIN_READ_AHEAD 2 /* minimum read ahead sectors */
#define LVM_MAX_READ_AHEAD 120 /* maximum read ahead sectors */
#define LVM_MAX_LV_IO_TIMEOUT 60 /* seconds I/O timeout (future use) */
#define LVM_PARTITION 0xfe /* LVM partition id */
#define LVM_NEW_PARTITION 0x8e /* new LVM partition id (10/09/1999) */
#define LVM_PE_SIZE_PV_SIZE_REL 5 /* max relation PV size and PE size */
#define LVM_SNAPSHOT_MAX_CHUNK 1024 /* 1024 KB */
#define LVM_SNAPSHOT_DEF_CHUNK 64 /* 64 KB */
#define LVM_SNAPSHOT_MIN_CHUNK (PAGE_SIZE/1024) /* 4 or 8 KB */
#define UNDEF -1
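The size limits quoted in the LVM_PE_T_MAX comment follow from a 16-bit logical extent number combined with the 1 TB cap in LVM_LV_SIZE_MAX. The sketch below redoes that arithmetic in user space; the `DEMO_` macros and `demo_lv_size_max()` are hypothetical names, 512-byte sectors are assumed, and the PE size is passed directly in sectors instead of being read from a VG structure.

```c
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512ULL
/* Usable range of a 16-bit extent number, as in LVM_PE_T_MAX: 65534. */
#define DEMO_PE_T_MAX ((1ULL << (sizeof(uint16_t) * 8)) - 2)
/* 1 TB expressed in sectors, as in LVM_MAX_SIZE: 2^31 sectors. */
#define DEMO_MAX_SIZE (1024ULL * 1024 / DEMO_SECTOR_SIZE * 1024 * 1024)

/* Largest LV size in sectors for a given PE size in sectors. */
static uint64_t demo_lv_size_max(uint64_t pe_size_sectors)
{
	uint64_t by_extents = DEMO_PE_T_MAX * pe_size_sectors;

	return by_extents > DEMO_MAX_SIZE ? DEMO_MAX_SIZE : by_extents;
}
```

For the default 4 MB PE size (8192 sectors) this gives 65534 * 8192 = 536854528 sectors, just under 256 GB; for the 16 GB maximum PE size the extent-based bound exceeds the cap, so the 1 TB limit applies.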
/*
* ioctls
* FIXME: the last parameter to _IO{W,R,WR} is a data type. The macro will
* expand this using sizeof(), so putting "1" there is misleading
* because sizeof(1) = sizeof(int) = sizeof(2) = 4 on a 32-bit machine!
*/
/* volume group */
#define VG_CREATE_OLD _IOW ( 0xfe, 0x00, 1)
#define VG_REMOVE _IOW ( 0xfe, 0x01, 1)
#define VG_EXTEND _IOW ( 0xfe, 0x03, 1)
#define VG_REDUCE _IOW ( 0xfe, 0x04, 1)
#define VG_STATUS _IOWR ( 0xfe, 0x05, 1)
#define VG_STATUS_GET_COUNT _IOWR ( 0xfe, 0x06, 1)
#define VG_STATUS_GET_NAMELIST _IOWR ( 0xfe, 0x07, 1)
#define VG_SET_EXTENDABLE _IOW ( 0xfe, 0x08, 1)
#define VG_RENAME _IOW ( 0xfe, 0x09, 1)
/* Since 0.9beta6 */
#define VG_CREATE _IOW ( 0xfe, 0x0a, 1)
/* logical volume */
#define LV_CREATE _IOW ( 0xfe, 0x20, 1)
#define LV_REMOVE _IOW ( 0xfe, 0x21, 1)
#define LV_ACTIVATE _IO ( 0xfe, 0x22)
#define LV_DEACTIVATE _IO ( 0xfe, 0x23)
#define LV_EXTEND _IOW ( 0xfe, 0x24, 1)
#define LV_REDUCE _IOW ( 0xfe, 0x25, 1)
#define LV_STATUS_BYNAME _IOWR ( 0xfe, 0x26, 1)
#define LV_STATUS_BYINDEX _IOWR ( 0xfe, 0x27, 1)
#define LV_SET_ACCESS _IOW ( 0xfe, 0x28, 1)
#define LV_SET_ALLOCATION _IOW ( 0xfe, 0x29, 1)
#define LV_SET_STATUS _IOW ( 0xfe, 0x2a, 1)
#define LE_REMAP _IOW ( 0xfe, 0x2b, 1)
#define LV_SNAPSHOT_USE_RATE _IOWR ( 0xfe, 0x2c, 1)
#define LV_STATUS_BYDEV _IOWR ( 0xfe, 0x2e, 1)
#define LV_RENAME _IOW ( 0xfe, 0x2f, 1)
#define LV_BMAP _IOWR ( 0xfe, 0x30, 1)
/* physical volume */
#define PV_STATUS _IOWR ( 0xfe, 0x40, 1)
#define PV_CHANGE _IOWR ( 0xfe, 0x41, 1)
#define PV_FLUSH _IOW ( 0xfe, 0x42, 1)
/* physical extent */
#define PE_LOCK_UNLOCK _IOW ( 0xfe, 0x50, 1)
/* i/o protocol version */
#define LVM_GET_IOP_VERSION _IOR ( 0xfe, 0x98, 1)
#ifdef LVM_TOTAL_RESET
/* special reset function for testing purposes */
#define LVM_RESET _IO ( 0xfe, 0x99)
#endif
/* lock the logical volume manager */
#define LVM_LOCK_LVM _IO ( 0xfe, 0x100)
/* END ioctls */
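The FIXME at the top of the ioctl list can be checked from user space on Linux. This is a small sketch assuming the userspace `<linux/ioctl.h>` header, where the size argument of `_IOW()` is expanded with `sizeof()`; `DEMO_VG_CREATE_OLD` and `demo_encoded_size()` are hypothetical names that mirror the VG_CREATE_OLD definition above.

```c
#include <stddef.h>
#include <linux/ioctl.h>

/* Same arguments as VG_CREATE_OLD: the "1" is an int expression, not a type. */
#define DEMO_VG_CREATE_OLD _IOW(0xfe, 0x00, 1)

/* Size field that ends up encoded in the ioctl number. */
static size_t demo_encoded_size(void)
{
	return _IOC_SIZE(DEMO_VG_CREATE_OLD);
}
```

The encoded size is that of an `int`, not of the request structure actually passed through the ioctl, which is why the comment calls the "1" misleading.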
/*
* Status flags
*/
/* volume group */
#define VG_ACTIVE 0x01 /* vg_status */
#define VG_EXPORTED 0x02 /* " */
#define VG_EXTENDABLE 0x04 /* " */
#define VG_READ 0x01 /* vg_access */
#define VG_WRITE 0x02 /* " */
#define VG_CLUSTERED 0x04 /* " */
#define VG_SHARED 0x08 /* " */
/* logical volume */
#define LV_ACTIVE 0x01 /* lv_status */
#define LV_SPINDOWN 0x02 /* " */
#define LV_READ 0x01 /* lv_access */
#define LV_WRITE 0x02 /* " */
#define LV_SNAPSHOT 0x04 /* " */
#define LV_SNAPSHOT_ORG 0x08 /* " */
#define LV_BADBLOCK_ON 0x01 /* lv_badblock */
#define LV_STRICT 0x01 /* lv_allocation */
#define LV_CONTIGUOUS 0x02 /* " */
/* physical volume */
#define PV_ACTIVE 0x01 /* pv_status */
#define PV_ALLOCATABLE 0x02 /* pv_allocatable */
/* misc */
#define LVM_SNAPSHOT_DROPPED_SECTOR 1
/*
* Structure definitions core/disk follow
*
* conditional conversion takes place on big endian architectures
* in functions * pv_copy_*(), vg_copy_*() and lv_copy_*()
*
*/
#define NAME_LEN 128 /* don't change!!! */
#define UUID_LEN 32 /* don't change!!! */
/* copy on write tables in disk format */
typedef struct lv_COW_table_disk_v1 {
uint64_t pv_org_number;
uint64_t pv_org_rsector;
uint64_t pv_snap_number;
uint64_t pv_snap_rsector;
} lv_COW_table_disk_t;
/* remap physical sector/rdev pairs including hash */
typedef struct lv_block_exception_v1 {
struct list_head hash;
uint32_t rsector_org;
kdev_t rdev_org;
uint32_t rsector_new;
kdev_t rdev_new;
} lv_block_exception_t;
/* disk stored pe information */
typedef struct {
uint16_t lv_num;
uint16_t le_num;
} pe_disk_t;
/* disk stored PV, VG, LV and PE size and offset information */
typedef struct {
uint32_t base;
uint32_t size;
} lvm_disk_data_t;
/*
* physical volume structures
*/
/* core */
typedef struct pv_v2 {
char id[2]; /* Identifier */
unsigned short version; /* HM lvm version */
lvm_disk_data_t pv_on_disk;
lvm_disk_data_t vg_on_disk;
lvm_disk_data_t pv_uuidlist_on_disk;
lvm_disk_data_t lv_on_disk;
lvm_disk_data_t pe_on_disk;
char pv_name[NAME_LEN];
char vg_name[NAME_LEN];
char system_id[NAME_LEN]; /* for vgexport/vgimport */
kdev_t pv_dev;
uint pv_number;
uint pv_status;
uint pv_allocatable;
uint pv_size; /* HM */
uint lv_cur;
uint pe_size;
uint pe_total;
uint pe_allocated;
uint pe_stale; /* for future use */
pe_disk_t *pe; /* HM */
struct block_device *bd;
char pv_uuid[UUID_LEN+1];
#ifndef __KERNEL__
uint32_t pe_start; /* in sectors */
#endif
} pv_t;
/* disk */
typedef struct pv_disk_v2 {
uint8_t id[2]; /* Identifier */
uint16_t version; /* HM lvm version */
lvm_disk_data_t pv_on_disk;
lvm_disk_data_t vg_on_disk;
lvm_disk_data_t pv_uuidlist_on_disk;
lvm_disk_data_t lv_on_disk;
lvm_disk_data_t pe_on_disk;
uint8_t pv_uuid[NAME_LEN];
uint8_t vg_name[NAME_LEN];
uint8_t system_id[NAME_LEN]; /* for vgexport/vgimport */
uint32_t pv_major;
uint32_t pv_number;
uint32_t pv_status;
uint32_t pv_allocatable;
uint32_t pv_size; /* HM */
uint32_t lv_cur;
uint32_t pe_size;
uint32_t pe_total;
uint32_t pe_allocated;
/* new in struct version 2 */
uint32_t pe_start; /* in sectors */
} pv_disk_t;
/*
* Structures for Logical Volume (LV)
*/
/* core PE information */
typedef struct {
kdev_t dev;
uint32_t pe; /* to be changed if > 2TB */
uint32_t reads;
uint32_t writes;
} pe_t;
typedef struct {
char lv_name[NAME_LEN];
kdev_t old_dev;
kdev_t new_dev;
uint32_t old_pe;
uint32_t new_pe;
} le_remap_req_t;
typedef struct lv_bmap {
uint32_t lv_block;
dev_t lv_dev;
} lv_bmap_t;
/*
* fixme...
*/
#define LVM_MAX_ATOMIC_IO 512
#define LVM_MAX_SECTORS (LVM_MAX_ATOMIC_IO * 2)
/*
* Structure Logical Volume (LV) Version 3
*/
struct kern_lv_v5;
struct user_lv_v5;
typedef struct user_lv_v5 userlv_t;
#ifdef __KERNEL__
typedef struct kern_lv_v5 lv_t;
#else
typedef struct user_lv_v5 lv_t;
#endif
struct user_lv_v5 {
char lv_name[NAME_LEN];
char vg_name[NAME_LEN];
uint lv_access;
uint lv_status;
uint lv_open; /* HM */
kdev_t lv_dev; /* HM */
uint lv_number; /* HM */
uint lv_mirror_copies; /* for future use */
uint lv_recovery; /* " */
uint lv_schedule; /* " */
uint lv_size;
pe_t *lv_current_pe; /* HM */
uint lv_current_le; /* for future use */
uint lv_allocated_le;
uint lv_stripes;
uint lv_stripesize;
uint lv_badblock; /* for future use */
uint lv_allocation;
uint lv_io_timeout; /* for future use */
uint lv_read_ahead;
/* delta to version 1 starts here */
lv_t *lv_snapshot_org;
lv_t *lv_snapshot_prev;
lv_t *lv_snapshot_next;
lv_block_exception_t *lv_block_exception;
uint lv_remap_ptr;
uint lv_remap_end;
uint lv_chunk_size;
uint lv_snapshot_minor;
};
struct kern_lv_v5{
struct user_lv_v5 u;
struct kiobuf *lv_iobuf;
sector_t blocks[LVM_MAX_SECTORS];
struct kiobuf *lv_COW_table_iobuf;
struct rw_semaphore lv_lock;
struct list_head *lv_snapshot_hash_table;
uint32_t lv_snapshot_hash_table_size;
uint32_t lv_snapshot_hash_mask;
wait_queue_head_t lv_snapshot_wait;
int lv_snapshot_use_rate;
struct vg_v3 *vg;
uint lv_allocated_snapshot_le;
};
/* disk */
typedef struct lv_disk_v3 {
uint8_t lv_name[NAME_LEN];
uint8_t vg_name[NAME_LEN];
uint32_t lv_access;
uint32_t lv_status;
uint32_t lv_open; /* HM */
uint32_t lv_dev; /* HM */
uint32_t lv_number; /* HM */
uint32_t lv_mirror_copies; /* for future use */
uint32_t lv_recovery; /* " */
uint32_t lv_schedule; /* " */
uint32_t lv_size;
uint32_t lv_snapshot_minor;/* minor number of original */
uint16_t lv_chunk_size; /* chunk size of snapshot */
uint16_t dummy;
uint32_t lv_allocated_le;
uint32_t lv_stripes;
uint32_t lv_stripesize;
uint32_t lv_badblock; /* for future use */
uint32_t lv_allocation;
uint32_t lv_io_timeout; /* for future use */
uint32_t lv_read_ahead; /* HM */
} lv_disk_t;
/*
* Structure Volume Group (VG) Version 1
*/
/* core */
typedef struct vg_v3 {
char vg_name[NAME_LEN]; /* volume group name */
uint vg_number; /* volume group number */
uint vg_access; /* read/write */
uint vg_status; /* active or not */
uint lv_max; /* maximum logical volumes */
uint lv_cur; /* current logical volumes */
uint lv_open; /* open logical volumes */
uint pv_max; /* maximum physical volumes */
uint pv_cur; /* current physical volumes FU */
uint pv_act; /* active physical volumes */
uint dummy; /* was obsolete max_pe_per_pv */
uint vgda; /* volume group descriptor arrays FU */
uint pe_size; /* physical extent size in sectors */
uint pe_total; /* total of physical extents */
uint pe_allocated; /* allocated physical extents */
uint pvg_total; /* physical volume groups FU */
struct proc_dir_entry *proc;
pv_t *pv[ABS_MAX_PV + 1]; /* physical volume struct pointers */
lv_t *lv[ABS_MAX_LV + 1]; /* logical volume struct pointers */
char vg_uuid[UUID_LEN+1]; /* volume group UUID */
#ifdef __KERNEL__
struct proc_dir_entry *vg_dir_pde;
struct proc_dir_entry *lv_subdir_pde;
struct proc_dir_entry *pv_subdir_pde;
#else
char dummy1[200];
#endif
} vg_t;
/* disk */
typedef struct vg_disk_v2 {
uint8_t vg_uuid[UUID_LEN]; /* volume group UUID */
uint8_t vg_name_dummy[NAME_LEN-UUID_LEN]; /* rest of v1 VG name */
uint32_t vg_number; /* volume group number */
uint32_t vg_access; /* read/write */
uint32_t vg_status; /* active or not */
uint32_t lv_max; /* maximum logical volumes */
uint32_t lv_cur; /* current logical volumes */
uint32_t lv_open; /* open logical volumes */
uint32_t pv_max; /* maximum physical volumes */
uint32_t pv_cur; /* current physical volumes FU */
uint32_t pv_act; /* active physical volumes */
uint32_t dummy;
uint32_t vgda; /* volume group descriptor arrays FU */
uint32_t pe_size; /* physical extent size in sectors */
uint32_t pe_total; /* total of physical extents */
uint32_t pe_allocated; /* allocated physical extents */
uint32_t pvg_total; /* physical volume groups FU */
} vg_disk_t;
/*
* Request structures for ioctls
*/
/* Request structure PV_STATUS_BY_NAME... */
typedef struct {
char pv_name[NAME_LEN];
pv_t *pv;
} pv_status_req_t, pv_change_req_t;
/* Request structure PV_FLUSH */
typedef struct {
char pv_name[NAME_LEN];
kdev_t pv_dev;
} pv_flush_req_t;
/* Request structure PE_MOVE */
typedef struct {
enum {
LOCK_PE, UNLOCK_PE
} lock;
struct {
kdev_t lv_dev;
kdev_t pv_dev;
uint32_t pv_offset;
} data;
} pe_lock_req_t;
/* Request structure LV_STATUS_BYNAME */
typedef struct {
char lv_name[NAME_LEN];
lv_t *lv;
} lv_status_byname_req_t, lv_req_t;
/* Request structure LV_STATUS_BYINDEX */
typedef struct {
uint32_t lv_index;
lv_t *lv;
/* Transfer size because user space and kernel space differ */
ushort size;
} lv_status_byindex_req_t;
/* Request structure LV_STATUS_BYDEV... */
typedef struct {
dev_t dev;
lv_t *lv;
} lv_status_bydev_req_t;
/* Request structure LV_SNAPSHOT_USE_RATE */
typedef struct {
int block;
int rate;
} lv_snapshot_use_rate_req_t;
/* useful inlines */
static inline ulong round_up(ulong n, ulong size) {
	size--;
	return (n + size) & ~size;
}
static inline ulong div_up(ulong n, ulong size) {
	return round_up(n, size) / size;
}
static int inline LVM_GET_COW_TABLE_CHUNKS_PER_PE(vg_t *vg, lv_t *lv) {
	return vg->pe_size / lv->u.lv_chunk_size;
}
static int inline LVM_GET_COW_TABLE_ENTRIES_PER_PE(vg_t *vg, lv_t *lv) {
	ulong chunks = vg->pe_size / lv->u.lv_chunk_size;
	ulong entry_size = sizeof(lv_COW_table_disk_t);
	ulong chunk_size = lv->u.lv_chunk_size * SECTOR_SIZE;
	ulong entries = (vg->pe_size * SECTOR_SIZE) /
		(entry_size + chunk_size);
	if(chunks < 2)
		return 0;
	for(; entries; entries--)
		if((div_up(entries * entry_size, chunk_size) + entries) <=
		   chunks)
			break;
	return entries;
}
#endif /* #ifndef _LVM_H_INCLUDE */
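Reviewer note: the removed helpers above size the snapshot copy-on-write table so that the table chunks plus the data chunks they describe fit inside one physical extent. The following is a minimal userspace sketch of that arithmetic, not the kernel code itself. The names `round_up_pow2`, `div_up_pow2`, `cow_entries_per_pe` and `struct cow_entry_disk` are hypothetical stand-ins, and a 512-byte `SECTOR_SIZE` is assumed. The mask trick in `round_up` only rounds correctly when `size` is a power of two, so the sketch makes that assumption explicit.

```c
#include <stdint.h>

#define SECTOR_SIZE 512UL	/* assumption: 512-byte sectors */

/* Stand-in for lv_COW_table_disk_t: four 64-bit fields, 32 bytes. */
struct cow_entry_disk {
	uint64_t pv_org_number;
	uint64_t pv_org_rsector;
	uint64_t pv_snap_number;
	uint64_t pv_snap_rsector;
};

/* Round n up to a multiple of size; size must be a power of two. */
static inline unsigned long round_up_pow2(unsigned long n, unsigned long size)
{
	size--;
	return (n + size) & ~size;
}

/* Ceiling division built on the helper above; same power-of-two limit. */
static inline unsigned long div_up_pow2(unsigned long n, unsigned long size)
{
	return round_up_pow2(n, size) / size;
}

/*
 * Largest number of COW entries whose table chunks plus data chunks
 * fit in one physical extent. Both arguments are in sectors, and
 * chunk_sectors is assumed to be a non-zero power of two.
 */
static unsigned long cow_entries_per_pe(unsigned long pe_sectors,
					unsigned long chunk_sectors)
{
	unsigned long chunks = pe_sectors / chunk_sectors;
	unsigned long entry_size = sizeof(struct cow_entry_disk);
	unsigned long chunk_bytes = chunk_sectors * SECTOR_SIZE;
	unsigned long entries = (pe_sectors * SECTOR_SIZE) /
				(entry_size + chunk_bytes);

	if (chunks < 2)
		return 0;
	/* Shrink the estimate until table chunks + data chunks fit. */
	for (; entries; entries--)
		if (div_up_pow2(entries * entry_size, chunk_bytes) + entries <=
		    chunks)
			break;
	return entries;
}
```

For example, with a 4 MiB extent (8192 sectors) and 8 KiB chunks (16 sectors) there are 512 chunks per extent. `cow_entries_per_pe(8192, 16)` settles on 510 entries, because 510 entries of 32 bytes each occupy 2 table chunks and 2 + 510 = 512.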
@@ -85,8 +85,6 @@
 #define IDE4_MAJOR 56
 #define IDE5_MAJOR 57
-#define LVM_BLK_MAJOR 58 /* Logical Volume Manager */
 #define SCSI_DISK1_MAJOR 65
 #define SCSI_DISK2_MAJOR 66
 #define SCSI_DISK3_MAJOR 67
@@ -138,8 +136,6 @@
 #define PHONE_MAJOR 100
-#define LVM_CHAR_MAJOR 109 /* Logical Volume Manager */
 #define RTF_MAJOR 150
 #define RAW_MAJOR 162
......