- 06 Jan, 2009 13 commits
-
-
Mikulas Patocka authored
Rework table reference counting. The existing code uses a reference counter. When the last reference is dropped and the counter reaches zero, the table destructor is called. Table reference counters are acquired/released from upcalls from other kernel code (dm_any_congested, dm_merge_bvec, dm_unplug_all). If the reference counter reaches zero in one of the upcalls, the table destructor is called from almost random kernel code. This leads to various problems: * dm_any_congested being called under a spinlock, which calls the destructor, which calls some sleeping function. * the destructor attempting to take a lock that is already taken by the same process. * stale reference from some other kernel code keeps the table constructed, which keeps some devices open, even after successful return from "dmsetup remove". This can confuse lvm and prevent closing of underlying devices or reusing device minor numbers. The patch changes reference counting so that the table destructor can be called only at predetermined places. The table has always exactly one reference from either mapped_device->map or hash_cell->new_map. After this patch, this reference is not counted in table->holders. A pair of dm_create_table/dm_destroy_table functions is used for table creation/destruction. Temporary references from the other code increase table->holders. A pair of dm_table_get/dm_table_put functions is used to manipulate it. When the table is about to be destroyed, we wait for table->holders to reach 0. Then, we call the table destructor. We use active waiting with msleep(1), because the situation happens rarely (to one user in 5 years) and removing the device isn't performance-critical task: the user doesn't care if it takes one tick more or not. This way, the destructor is called only at specific points (dm_table_destroy function) and the above problems associated with lazy destruction can't happen. Finally remove the temporary protection added to dm_any_congested(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Andi Kleen authored
Implement barrier support for single device DM devices This patch implements barrier support in DM for the common case of dm linear just remapping a single underlying device. In this case we can safely pass the barrier through because there can be no reordering between devices. NB. Any DM device might cease to support barriers if it gets reconfigured so code must continue to allow for a possible -EOPNOTSUPP on every barrier bio submitted. - agk Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Kiyoshi Ueda authored
This patch adds the following target interfaces for request-based dm. map_rq : for mapping a request rq_end_io : for finishing a request busy : for avoiding performance regression from bio-based dm. Target can tell dm core not to map requests now, and that may help requests in the block layer queue to be bigger by I/O merging. In bio-based dm, this behavior is done by device drivers managing the block layer queue. But in request-based dm, dm core has to do that since dm core manages the block layer queue. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Kiyoshi Ueda authored
This patch prepares some kmem_caches for request-based dm. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Allow NULL buffer in dm_copy_name_and_uuid if you only want to return one of the fields. (Required by a following patch that adds these fields to sysfs.) Signed-off-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Check that the log bitmap will fit within the log device. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Milan Broz authored
Move log size validation from mirror target to log constructor. Removed PAGE_SIZE restriction we no longer think necessary. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Takahiro Yasui authored
rw_header function updates three members of io_req data every time when I/O is processed. bi_rw and notify.fn are never modified once they get initialized, and so they can be set in advance. header_to_disk() can also be pulled out of write_header() since only one caller needs it and write_header() can be replaced by rw_header() directly. Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Change dm_unregister_target to return void and use BUG() for error reporting. dm_unregister_target can only fail because of programming bug in the target driver. It can't fail because of user's behavior or disk errors. This patch changes unregister_target to return void and use BUG if someone tries to unregister non-registered target or unregister target that is in use. This patch removes code duplication (testing of error codes in all dm targets) and reports bugs in just one place, in dm_unregister_target. In some target drivers, these return codes were ignored, which could lead to a situation where bugs could be missed. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Jonathan Brassow authored
Always increase the error count when I/O on a leg of a mirror fails. The error count is used to decide whether to select an alternative mirror leg. If the target doesn't use the "handle_errors" feature, the error count is not updated and the bio can get requeued forever by the read callback. Fix it by increasing error_count before the handle_errors feature checking. Cc: stable@kernel.org Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Takahiro Yasui authored
In create_log_context function, dm_io_client_destroy function needs to be called, when memory allocation of disk_header, sync_bits and recovering_bits failed, but dm_io_client_destroy is not called. Cc: stable@kernel.org Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Acked-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Change yield() to msleep(1). If the thread had realtime priority, yield() doesn't really yield, so the yielding process would loop indefinitely and cause machine lockup. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
Mikulas Patocka authored
Move one dm_table_put() so that the last reference in the thread gets dropped in __unbind(). This is required for a following patch, dm-table-rework-reference-counting.patch, which will change the logic in such a way that table destructor is called only at specific points in the code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-
- 05 Jan, 2009 1 commit
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-currentLinus Torvalds authored
* 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: audit: validate comparison operations, store them in sane form clean up audit_rule_{add,del} a bit make sure that filterkey of task,always rules is reported audit rules ordering, part 2 fixing audit rule ordering mess, part 1 audit_update_lsm_rules() misses the audit_inode_hash[] ones sanitize audit_log_capset() sanitize audit_fd_pair() sanitize audit_mq_open() sanitize AUDIT_MQ_SENDRECV sanitize audit_mq_notify() sanitize audit_mq_getsetattr() sanitize audit_ipc_set_perm() sanitize audit_ipc_obj() sanitize audit_socketcall don't reallocate buffer in every audit_sockaddr()
-
- 04 Jan, 2009 23 commits
-
-
Alessandro Zummo authored
Add standard interfaces for alarm/update irqs enabling. Drivers are no more required to implement equivalent ioctl code as rtc-dev will provide it. UIE emulation should now be handled correctly and will work even for those RTC drivers who cannot be configured to do both UIE and AIE. Signed-off-by: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Nick Piggin authored
With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Bruno Prémont authored
The function viafb_cursor() uses 2 stack-variables of CURSOR_SIZE bits; CURSOR_SIZE is defined as (8 * 1024). Using up twice 1k on stack is too much for 4k-stack (though it works with 8k-stacks). Make those two variables kzalloc'ed to preserve stack space. Also merge the whole lot of local struct's in viafb_ioctl into a union so the stack usage gets minimized here as well. (struct's are only accessed in their indicidual IOCTL case) This second part is only compile-tested as I know of no userspace app using the IOCTLs. Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Cc: <JosephChan@via.com.tw> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Pekka Enberg authored
As suggested by Andreas Dilger, introduce a bgl_lock_ptr() helper in <linux/blockgroup_lock.h> and add separate sb_bgl_lock() helpers to filesystem specific header files to break the hidden dependency to struct ext[234]_sb_info. Also, while at it, convert the macros to static inlines to try make up for all the times I broke Andrew Morton's tree. Acked-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Randy Dunlap authored
Include header files as used/needed: In file included from drivers/leds/leds-dac124s085.c:16: include/linux/spi/spi.h:66: error: field 'dev' has incomplete type include/linux/spi/spi.h: In function 'to_spi_device': include/linux/spi/spi.h:100: warning: type defaults to 'int' in declaration of '__mptr' ... Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Adam Lackorzynski authored
The flush_cache_vmap in vmap_page_range() is called with the end of the range twice. The following patch fixes this for me. Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Li Zefan authored
The race is calling cgroup_clone() while umounting the ns cgroup subsys, and thus cgroup_clone() might access invalid cgroup_fs, or kill_sb() is called after cgroup_clone() created a new dir in it. The BUG I triggered is BUG_ON(root->number_of_cgroups != 1); ------------[ cut here ]------------ kernel BUG at kernel/cgroup.c:1093! invalid opcode: 0000 [#1] SMP ... Process umount (pid: 5177, ti=e411e000 task=e40c4670 task.ti=e411e000) ... Call Trace: [<c0493df7>] ? deactivate_super+0x3f/0x51 [<c04a3600>] ? mntput_no_expire+0xb3/0xdd [<c04a3ab2>] ? sys_umount+0x265/0x2ac [<c04a3b06>] ? sys_oldumount+0xd/0xf [<c0403911>] ? sysenter_do_call+0x12/0x31 ... EIP: [<c0456e76>] cgroup_kill_sb+0x23/0xe0 SS:ESP 0068:e411ef2c ---[ end trace c766c1be3bf944ac ]--- Cc: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Al Viro authored
Don't store the field->op in the messy (and very inconvenient for e.g. audit_comparator()) form; translate to dense set of values and do full validation of userland-submitted value while we are at it. ->audit_init_rule() and ->audit_match_rule() get new values now; in-tree instances updated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Fix the actual rule listing; add per-type lists _not_ used for matching, with all exit,... sitting on one such list. Simplifies "do something for all rules" logics, while we are at it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Problem: ordering between the rules on exit chain is currently lost; all watch and inode rules are listed after everything else _and_ exit,never on one kind doesn't stop exit,always on another from being matched. Solution: assign priorities to rules, keep track of the current highest-priority matching rule and its result (always/never). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* no allocations * return void * don't duplicate checked for dummy context Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* no allocations * return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* don't bother with allocations * don't do double copy_from_user() * don't duplicate parts of check for audit_dummy_context() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* logging the original value of *msg_prio in mq_timedreceive(2) is insane - the argument is write-only (i.e. syscall always ignores the original value and only overwrites it). * merge __audit_mq_timed{send,receive} * don't do copy_from_user() twice * don't mess with allocations in auditsc part * ... and don't bother checking !audit_enabled and !context in there - we'd already checked for audit_dummy_context(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* don't copy_from_user() twice * don't bother with allocations * don't duplicate parts of audit_dummy_context() * make it return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* get rid of allocations * make it return void * don't duplicate parts of audit_dummy_context() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* get rid of allocations * make it return void * simplify callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* get rid of allocations * make it return void * simplify callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* don't bother with allocations * now that it can't fail, make it return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
No need to do that more than once per process lifetime; allocating/freeing on each sendto/accept/etc. is bloody pointless. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 03 Jan, 2009 3 commits
-
-
Linus Torvalds authored
Merge branch 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits) x86: setup_per_cpu_areas() cleanup cpumask: fix compile error when CONFIG_NR_CPUS is not defined cpumask: use alloc_cpumask_var_node where appropriate cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t x86: use cpumask_var_t in acpi/boot.c x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids sched: put back some stack hog changes that were undone in kernel/sched.c x86: enable cpus display of kernel_max and offlined cpus ia64: cpumask fix for is_affinity_mask_valid() cpumask: convert RCU implementations, fix xtensa: define __fls mn10300: define __fls m32r: define __fls h8300: define __fls frv: define __fls cris: define __fls cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS cpumask: zero extra bits in alloc_cpumask_var_node cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/ cpumask: convert mm/ ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommuLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu: (89 commits) AMD IOMMU: remove now unnecessary #ifdefs AMD IOMMU: prealloc_protection_domains should be static kvm/iommu: fix compile warning AMD IOMMU: add statistics about total number of map requests AMD IOMMU: add statistics about allocated io memory AMD IOMMU: add stats counter for domain tlb flushes AMD IOMMU: add stats counter for single iommu domain tlb flushes AMD IOMMU: add stats counter for cross-page request AMD IOMMU: add stats counter for free_coherent requests AMD IOMMU: add stats counter for alloc_coherent requests AMD IOMMU: add stats counter for unmap_sg requests AMD IOMMU: add stats counter for map_sg requests AMD IOMMU: add stats counter for unmap_single requests AMD IOMMU: add stats counter for map_single requests AMD IOMMU: add stats counter for completion wait events AMD IOMMU: add init code for statistic collection AMD IOMMU: add necessary header defines for stats counting AMD IOMMU: add Kconfig entry for statistic collection code AMD IOMMU: use dev_name in iommu_enable function AMD IOMMU: use calc_devid in prealloc_protection_domains ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6Linus Torvalds authored
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (34 commits) V4L/DVB (10173): Missing v4l2_prio_close in radio_release V4L/DVB (10172): add DVB_DEVICE_TYPE= to uevent V4L/DVB (10171): Use usb_set_intfdata V4L/DVB (10170): tuner-simple: prevent possible OOPS caused by divide by zero error V4L/DVB (10168): sms1xxx: fix inverted gpio for lna control on tiger r2 V4L/DVB (10167): sms1xxx: add support for inverted gpio V4L/DVB (10166): dvb frontend: stop using non-C99 compliant comments V4L/DVB (10165): Add FE_CAN_2G_MODULATION flag to frontends that support DVB-S2 V4L/DVB (10164): Add missing S2 caps flag to S2API V4L/DVB (10163): em28xx: allocate adev together with struct em28xx dev V4L/DVB (10162): tuner-simple: Fix tuner type set message V4L/DVB (10161): saa7134: fix autodetection for AVer TV GO 007 FM Plus V4L/DVB (10160): em28xx: update chip id for em2710 V4L/DVB (10157): Add USB ID for the Sil4701 radio from DealExtreme V4L/DVB (10156): saa7134: Add support for Avermedia AVer TV GO 007 FM Plus V4L/DVB (10155): Add TEA5764 radio driver V4L/DVB (10154): saa7134: fix a merge conflict on Behold H6 board V4L/DVB (10153): Add the Beholder H6 card to DVB-T part of sources. V4L/DVB (10152): Change configuration of the Beholder H6 card V4L/DVB (10151): Fix I2C bridge error in zl10353 ...
-