- 24 Jun, 2010 12 commits
-
-
NeilBrown authored
There are few situations where it would make any sense to add a spare when reducing the number of devices in an array, but it is conceivable: A 6 drive RAID6 with two missing devices could be reshaped to a 5 drive RAID6, and a spare could become available just in time for the reshape, but not early enough to have been recovered first. 'freezing' recovery can make this easy to do without any races. However doing such a thing is a bad idea. md will not record the partially-recovered state of the 'spare' and when the reshape finished it will think that the spare is still spare. Easiest way to avoid this confusion is to simply disallow it. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
As the comment says, the tail of this loop only applies to devices that are not fully in sync, so if In_sync was set, we should avoid the rest of the loop. This bug will hardly ever cause an actual problem. The worst it can do is allow an array to be assembled that is dirty and degraded, which is not generally a good idea (without warning the sysadmin first). This will only happen if the array is RAID4 or a RAID5/6 in an intermediate state during a reshape and so has one drive that is all 'parity' - no data - while some other device has failed. This is certainly possible, but not at all common. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
During a recovery of reshape the early part of some devices might be in-sync while the later parts are not. We we know we are looking at an early part it is good to treat that part as in-sync for stripe calculations. This is particularly important for a reshape which suffers device failure. Treating the data as in-sync can mean the difference between data-safety and data-loss. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When we are reshaping an array, the device failure combinations that cause us to decide that the array as failed are more subtle. In particular, any 'spare' will be fully in-sync in the section of the array that has already been reshaped, thus failures that affect only that section are less critical. So encode this subtlety in a new function and call it as appropriate. The case that showed this problem was a 4 drive RAID5 to 8 drive RAID6 conversion where the last two devices failed. This resulted in: good good good good incomplete good good failed failed while converting a 5-drive RAID6 to 8 drive RAID5 The incomplete device causes the whole array to look bad, bad as it was actually good for the section that had been converted to 8-drives, all the data was actually safe. Reported-by: Terry Morris <tbmorris@tbmorris.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When an array is reshaped to have fewer devices, the reshape proceeds from the end of the devices to the beginning. If a device happens to be non-In_sync (which is possible but rare) we would normally update the ->recovery_offset as the reshape progresses. However that would be wrong as the recover_offset records that the early part of the device is in_sync, while in fact it would only be the later part that is in_sync, and in any case the offset number would be measured from the wrong end of the device. Relatedly, if after a reshape a spare is discovered to not be recoverred all the way to the end, not allow spare_active to incorporate it in the array. This becomes relevant in the following sample scenario: A 4 drive RAID5 is converted to a 6 drive RAID6 in a combined operation. The RAID5->RAID6 conversion will cause a 5 drive to be included as a spare, then the 5drive -> 6drive reshape will effectively rebuild that spare as it progresses. The 6th drive is treated as in_sync the whole time as there is never any case that we might consider reading from it, but must not because there is no valid data. If we interrupt this reshape part-way through and reverse it to return to a 5-drive RAID6 (or event a 4-drive RAID5), we don't want to update the recovery_offset - as that would be wrong - and we don't want to include that spare as active in the 5-drive RAID6 when the reversed reshape completed and it will be mostly out-of-sync still. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
The entries in the stripe_cache maintained by raid5 are enlarged when we increased the number of devices in the array, but not shrunk when we reduce the number of devices. So if entries are added after reducing the number of devices, we much ensure to initialise the whole entry, not just the part that is currently relevant. Otherwise if we enlarge the array again, we will reference uninitialised values. As grow_buffers/shrink_buffer now want to use a count that is stored explicity in the raid_conf, they should get it from there rather than being passed it as a parameter. Signed-off-by: NeilBrown <neilb@suse.de>
-
Maciej Trela authored
Only level 5 with layout=PARITY_N can be taken over to raid0 now. Lets allow level 4 either. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Maciej Trela authored
After takeover from raid5/10 -> raid0 mddev->layout is not cleared. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Maciej Trela authored
Use mddev->new_layout in setup_conf. Also use new_chunk, and don't set ->degraded in takeover(). That gets set in run() Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
Most array level changes leave the list of devices largely unchanged, possibly causing one at the end to become redundant. However conversions between RAID0 and RAID10 need to renumber all devices (except 0). This renumbering is currently being done in the ->run method when the new personality takes over. However this is too late as the common code in md.c might already have invalidated some of the devices if they had a ->raid_disk number that appeared to high. Moving it into the ->takeover method is too early as the array is still active at that time and wrong ->raid_disk numbers could cause confusion. So add a ->new_raid_disk field to mdk_rdev_s and use it to communicate the new raid_disk number. Now the common code knows exactly which devices need to be renumbered, and which can be invalidated, and can do it all at a convenient time when the array is suspend. It can also update some symlinks in sysfs which previously were not be updated correctly. Reported-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Prasanna S. Panchamukhi authored
Such NULL pointer dereference can occur when the driver was fixing the read errors/bad blocks and the disk was physically removed causing a system crash. This patch check if the rcu_dereference() returns valid rdev before accessing it in fix_read_error(). Cc: stable@kernel.org Signed-off-by: Prasanna S. Panchamukhi <prasanna.panchamukhi@riverbed.com> Signed-off-by: Rob Becker <rbecker@riverbed.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
Commit b821eaa5 broke partition detection for md arrays. The logic was almost right. However if revalidate_disk is called when the device is not yet open, bdev->bd_disk won't be set, so the flush_disk() Call will not set bd_invalidated. So when md_open is called we still need to ensure that ->bd_invalidated gets set. This is easily done with a call to check_disk_size_change in the place where the offending commit removed check_disk_change. At the important times, the size will have changed from 0 to non-zero, so check_disk_size_change will set bd_invalidated. Tested-by: Duncan <1i5t5.duncan@cox.net> Reported-by: Duncan <1i5t5.duncan@cox.net> Signed-off-by: NeilBrown <neilb@suse.de>
-
- 12 Jun, 2010 1 commit
-
-
Linus Torvalds authored
-
- 11 Jun, 2010 27 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: wimax/i2400m: fix missing endian correction read in fw loader net8139: fix a race at the end of NAPI pktgen: Fix accuracy of inter-packet delay. pkt_sched: gen_estimator: add a new lock net: deliver skbs on inactive slaves to exact matches ipv6: fix ICMP6_MIB_OUTERRORS r8169: fix mdio_read and update mdio_write according to hw specs gianfar: Revive the driver for eTSEC devices (disable timestamping) caif: fix a couple range checks phylib: Add support for the LXT973 phy. net: Print num_rx_queues imbalance warning only when there are allocated queues
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6Linus Torvalds authored
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / x86: Save/restore MISC_ENABLE register
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: The file argument for fsync() is never null Btrfs: handle ERR_PTR from posix_acl_from_xattr() Btrfs: avoid BUG when dropping root and reference in same transaction Btrfs: prohibit a operation of changing acl's mask when noacl mount option used Btrfs: should add a permission check for setfacl Btrfs: btrfs_lookup_dir_item() can return ERR_PTR Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs Btrfs: unwind after btrfs_start_transaction() errors Btrfs: btrfs_iget() returns ERR_PTR Btrfs: handle kzalloc() failure in open_ctree() Btrfs: handle error returns from btrfs_lookup_dir_item() Btrfs: Fix BUG_ON for fs converted from extN Btrfs: Fix null dereference in relocation.c Btrfs: fix remap_file_pages error Btrfs: uninitialized data is check_path_shared() Btrfs: fix fallocate regression Btrfs: fix loop device on top of btrfs
-
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: clear bridge resource range if BIOS assigned bad one PCI: hotplug/cpqphp, fix NULL dereference Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/" PCI: change resource collision messages from KERN_ERR to KERN_INFO
-
Yinghai Lu authored
Yannick found that video does not work with 2.6.34. The cause of this bug was that the BIOS had assigned the wrong range to the PCI bridge above the video device. Before 2.6.34 the kernel would have shrunk the size of the bridge window, but since d65245c3 PCI: don't shrink bridge resources the kernel will avoid shrinking BIOS ranges. So zero out the old range if we fail to claim it at boot time; this will cause us to allocate a new range at startup, restoring the 2.6.34 behavior. Fixes regression https://bugzilla.kernel.org/show_bug.cgi?id=16009. Reported-by: Yannick <yannick.roehlly@free.fr> Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
Jiri Slaby authored
There are devices out there which are PCI Hot-plug controllers with compaq PCI IDs, but are not bridges, hence have pdev->subordinate NULL. But cpqphp expects the pointer to be non-NULL. Add a check to the probe function to avoid oopses like: BUG: unable to handle kernel NULL pointer dereference at 00000050 IP: [<f82e3c41>] cpqhpc_probe+0x951/0x1120 [cpqphp] *pdpt = 0000000033779001 *pde = 0000000000000000 ... The device here was: 00:0b.0 PCI Hot-plug controller [0804]: Compaq Computer Corporation PCI Hotplug Controller [0e11:a0f7] (rev 11) Subsystem: Compaq Computer Corporation Device [0e11:a2f8] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Greg KH <greg@kroah.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
Jesse Barnes authored
This reverts commit 75568f80. Since they're just a convenience anyway, remove these symlinks since they're causing duplicate filename errors in the wild. Acked-by: Alex Chiang <achiang@canonical.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
Bjorn Helgaas authored
We can often deal with PCI resource issues by moving devices around. In that case, there's no point in alarming the user with messages like these. There are many bug reports where the message itself is the only problem, e.g., https://bugs.launchpad.net/ubuntu/+source/linux/+bug/413419 . Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
Dan Carpenter authored
The "file" argument for fsync is never null so we can remove this check. What drew my attention here is that 7ea80859: "drop unused dentry argument to ->fsync" introduced an unconditional dereference at the start of the function and that generated a smatch warning. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
posix_acl_from_xattr() returns both ERR_PTRs and null, but it's OK to pass null values to set_cached_acl() Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Sage Weil authored
If btrfs_ioctl_snap_destroy() deletes a snapshot but finishes with end_transaction(), the cleaner kthread may come in and drop the root in the same transaction. If that's the case, the root's refs still == 1 in the tree when btrfs_del_root() deletes the item, because commit_fs_roots() hasn't updated it yet (that happens during the commit). This wasn't a problem before only because btrfs_ioctl_snap_destroy() would commit the transaction before dropping the dentry reference, so the dead root wouldn't get queued up until after the fs root item was updated in the btree. Since it is not an error to drop the root reference and the root in the same transaction, just drop the BUG_ON() in btrfs_del_root(). Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Shi Weihua authored
when used Posix File System Test Suite(pjd-fstest) to test btrfs, some cases about setfacl failed when noacl mount option used. I simplified used commands in pjd-fstest, and the following steps can reproduce it. ------------------------ # cd btrfs-part/ # mkdir aaa # setfacl -m m::rw aaa <- successed, but not expected by pjd-fstest. ------------------------ I checked ext3, a warning message occured, like as: setfacl: aaa/: Operation not supported Certainly, it's expected by pjd-fstest. So, i compared acl.c of btrfs and ext3. Based on that, a patch created. Fortunately, it works. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Shi Weihua authored
On btrfs, do the following ------------------ # su user1 # cd btrfs-part/ # touch aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rw- group::rw- other::r-- # su user2 # cd btrfs-part/ # setfacl -m u::rwx aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rwx <- successed to setfacl group::rw- other::r-- ------------------ but we should prohibit it that user2 changing user1's acl. In fact, on ext3 and other fs, a message occurs: setfacl: aaa: Operation not permitted This patch fixed it. Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
btrfs_lookup_dir_item() can return either ERR_PTRs or null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
btrfs_read_fs_root_no_name() returns ERR_PTRs on error so I added a check for that. It's not clear to me if it can also return NULL pointers or not so I left the original NULL pointer check as is. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
This was added by a22285a6: "Btrfs: Integrate metadata reservation with start_transaction". If we goto out here then we skip all the unwinding and there are locks still held etc. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
btrfs_iget() returns an ERR_PTR() on failure and not null. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
Unwind and return -ENOMEM if the allocation fails here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
If btrfs_lookup_dir_item() fails, we should can just let the mount fail with an error. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Yan, Zheng authored
Tree blocks can live in data block groups in FS converted from extN. So it's easy to trigger the BUG_ON. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Yan, Zheng authored
Fix a potential null dereference in relocation.c Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Acked-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
-
Inaky Perez-Gonzalez authored
i2400m_fw_hdr_check() was accessing hardware field bcf_hdr->module_type (little endian 32) without converting to host byte sex. Reported-by: Данилин Михаил <mdanilin@nsg.net.ru> Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6Linus Torvalds authored
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: kbuild: Create output directory in Makefile.modbuiltin kbuild: Generate modules.builtin in make modules
-
git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6Linus Torvalds authored
* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6: pcmcia: avoid validate_cis failure on CIS override pcmcia: dev_node removal bugfix pcmcia: yenta_socket.c Remove extra #ifdef CONFIG_YENTA_TI pcmcia: only keep saved I365_CSCINT flag if there is no PCI irq
-
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-clientLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: try to send partial cap release on cap message on missing inode ceph: release cap on import if we don't have the inode ceph: fix misleading/incorrect debug message ceph: fix atomic64_t initialization on ia64 ceph: fix lease revocation when seq doesn't match ceph: fix f_namelen reported by statfs ceph: fix memory leak in statfs ceph: fix d_subdirs ordering problem
-
Miao Xie authored
when we use remap_file_pages() to remap a file, remap_file_pages always return error. It is because btrfs didn't set VM_CAN_NONLINEAR for vma. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
-