- 11 Nov, 2013 2 commits
-
-
Joe Thornber authored
"Passthrough" is a dm-cache operating mode (like writethrough or writeback) which is intended to be used when the cache contents are not known to be coherent with the origin device. It behaves as follows: * All reads are served from the origin device (all reads miss the cache) * All writes are forwarded to the origin device; additionally, write hits cause cache block invalidates This mode decouples cache coherency checks from cache device creation, largely to avoid having to perform coherency checks while booting. Boot scripts can create cache devices in passthrough mode and put them into service (mount cached filesystems, for example) without having to worry about coherency. Coherency that exists is maintained, although the cache will gradually cool as writes take place. Later, applications can perform coherency checks, the nature of which will depend on the type of the underlying storage. If coherency can be verified, the cache device can be transitioned to writethrough or writeback mode while still warm; otherwise, the cache contents can be discarded prior to transitioning to the desired operating mode. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Morgan Mears <Morgan.Mears@netapp.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Allow a cache to shrink if the blocks being removed from the cache are not dirty. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 09 Nov, 2013 21 commits
-
-
Joe Thornber authored
If a write block triggers promotion and covers a whole block we can avoid a copy. Introduce dm_{hook,unhook}_bio to simplify saving and restoring bio fields (bi_private is now used by overwrite). Switch writethrough support over to using these helpers too. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Previously these promotions only got priority if there were unused cache blocks. Now we give them priority if there are any clean blocks in the cache. The fio_soak_test in the device-mapper-test-suite now gives uniform performance across subvolumes (~16 seconds). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
There are now two multiqueues for in cache blocks. A clean one and a dirty one. writeback_work comes from the dirty one. Demotions come from the clean one. There are two benefits: - Performance improvement, since demoting a clean block is a noop. - The cache cleans itself when io load is light. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
Check commit_requested flag _before_ calling dm_cache_changed_this_transaction() superfluously. Also, be sure to set last_commit_jiffies _after_ dm_cache_commit() completes. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Don't waste time spotting blocks that have been allocated and then freed in the same transaction. The extra lookup is expensive, and I don't think it really gives us much. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mikulas Patocka authored
The option DM_LOG_USERSPACE is sub-option of DM_MIRROR, so place it right after DM_MIRROR. Doing so fixes various other Device mapper targets/features to be properly nested under "Device mapper support". Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mikulas Patocka authored
This patch allows the removal of an open device to be deferred until it is closed. (Previously such a removal attempt would fail.) The deferred remove functionality is enabled by setting the flag DM_DEFERRED_REMOVE in the ioctl structure on DM_DEV_REMOVE or DM_REMOVE_ALL ioctl. On return from DM_DEV_REMOVE, the flag DM_DEFERRED_REMOVE indicates if the device was removed immediately or flagged to be removed on close - if the flag is clear, the device was removed. On return from DM_DEV_STATUS and other ioctls, the flag DM_DEFERRED_REMOVE is set if the device is scheduled to be removed on closure. A device that is scheduled to be deleted can be revived using the message "@cancel_deferred_remove". This message clears the DMF_DEFERRED_REMOVE flag so that the device won't be deleted on close. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
If preresume fails it is worth logging an error given that a device is left suspended due to the failure. This change was motivated by local preresume error logging that was added to the cache target ("preresume failed"). Elevating this target-agnostic context for the where the target-specific error occurred relative to the DM core's callouts makes sense. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com>
-
Milan Broz authored
dm-crypt can already activate TCRYPT (TrueCrypt compatible) containers in LRW or XTS block encryption mode. TCRYPT containers prior to version 4.1 use CBC mode with some additional tweaks, this patch adds support for these containers. This new mode is implemented using special IV generator named TCW (TrueCrypt IV with whitening). TCW IV only supports containers that are encrypted with one cipher (Tested with AES, Twofish, Serpent, CAST5 and TripleDES). While this mode is legacy and is known to be vulnerable to some watermarking attacks (e.g. revealing of hidden disk existence) it can still be useful to activate old containers without using 3rd party software or for independent forensic analysis of such containers. (Both the userspace and kernel code is an independent implementation based on the format documentation and it completely avoids use of original source code.) The TCW IV generator uses two additional keys: Kw (whitening seed, size is always 16 bytes - TCW_WHITENING_SIZE) and Kiv (IV seed, size is always the IV size of the selected cipher). These keys are concatenated at the end of the main encryption key provided in mapping table. While whitening is completely independent from IV, it is implemented inside IV generator for simplification. The whitening value is always 16 bytes long and is calculated per sector from provided Kw as initial seed, xored with sector number and mixed with CRC32 algorithm. Resulting value is xored with ciphertext sector content. IV is calculated from the provided Kiv as initial IV seed and xored with sector number. Detailed calculation can be found in the Truecrypt documentation for version < 4.1 and will also be described on dm-crypt site, see: http://code.google.com/p/cryptsetup/wiki/DMCrypt The experimental support for activation of these containers is already present in git devel brach of cryptsetup. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Milan Broz authored
Some encryption modes use extra keys (e.g. loopAES has IV seed) which are not used in block cipher initialization but are part of key string in table constructor. This patch adds an additional field which describes the length of the extra key(s) and substracts it before real key encryption setting. The key_size always includes the size, in bytes, of the key provided in mapping table. The key_parts describes how many parts (usually keys) are contained in the whole key buffer. And key_extra_size contains size in bytes of additional keys part (this number of bytes must be subtracted because it is processed by the IV generator). | K1 | K2 | .... | K64 | Kiv | |----------- key_size ----------------- | | |-key_extra_size-| | [64 keys] | [1 key] | => key_parts = 65 Example where key string contains main key K, whitening key Kw and IV seed Kiv: | K | Kiv | Kw | |--------------- key_size --------------| | |-----key_extra_size------| | [1 key] | [1 key] | [1 key] | => key_parts = 3 Because key_extra_size is calculated during IV mode setting, key initialization is moved after this step. For now, this change has no effect to supported modes (thanks to ilog2 rounding) but it is required by the following patch. Also, fix a sparse warning in crypt_iv_lmk_one(). Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
A migration failure should be logged (albeit limited). Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
Fix a few cell_defer() calls that weren't passing a bool. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mikulas Patocka authored
Return -EINVAL when the specified cache policy is unknown rather than returning -ENOMEM. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Rename takeout_queue to concat_queue. Fix a harmless bug in mq policies pop() function. Currently pop() always succeeds, with up coming changes this wont be the case. Fix typo in comment above pre_cache_to_cache prototype. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
No need to return from a void function. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Make the quiescing flag an atomic_t and stop protecting it with a spin lock. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
The code that was trying to do this was inadequate. The postsuspend method (in ioctl context), needs to wait for the worker thread to acknowledge the request to quiesce. Otherwise the migration count may drop to zero temporarily before the worker thread realises we're quiescing. In this case the target will be taken down, but the worker thread may have issued a new migration, which will cause an oops when it completes. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.9+
-
Joe Thornber authored
Previously only origin bios could trigger ticks, which meant if all the io was destined for the cache no ticks were generated. If no ticks are generated then multiple hits, and movements in general, are attributed to the same tick. Only a stop gap fix, we need a better solution. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
It is safe to use a mutex in mq_residency() at this point since it is only called from ioctl context. But future-proof mq_residency() by using might_sleep() to catch new contexts that cannot sleep. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 05 Nov, 2013 2 commits
-
-
Joe Thornber authored
Entries would be lost if the old tail block was partially filled. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.9+
-
Hannes Reinecke authored
When pg_init is running no I/O can be submitted to the underlying devices, as the path priority etc might change. When using queue_io for this, requests will be piling up within multipath as the block I/O scheduler just sees a _very fast_ device. All of this queued I/O has to be resubmitted from within multipathing once pg_init is done. This approach has the problem that it's virtually impossible to abort I/O when pg_init is running, and we're adding heavy load to the devices after pg_init since all of the queued I/O needs to be resubmitted _before_ any requests can be pulled off of the request queue and normal operation continues. This patch will requeue the I/O that triggers the pg_init call, and return 'busy' when pg_init is in progress. With these changes the block I/O scheduler will stop submitting I/O during pg_init, resulting in a quicker path switch and less I/O pressure (and memory consumption) after pg_init. Signed-off-by: Hannes Reinecke <hare@suse.de> [patch header edited for clarity and typos by Mike Snitzer] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 01 Nov, 2013 1 commit
-
-
Shiva Krishna Merla authored
Whenever multipath_dtr() is happening we must prevent queueing any further path activation work. Implement this by adding a new 'pg_init_disabled' flag to the multipath structure that denotes future path activation work should be skipped if it is set. By disabling pg_init and then re-enabling in flush_multipath_work() we also avoid the potential for pg_init to be initiated while suspending an mpath device. Without this patch a race condition exists that may result in a kernel panic: 1) If after pg_init_done() decrements pg_init_in_progress to 0, a call to wait_for_pg_init_completion() assumes there are no more pending path management commands. 2) If pg_init_required is set by pg_init_done(), due to retryable mode_select errors, then process_queued_ios() will again queue the path activation work. 3) If free_multipath() completes before activate_path() work is called a NULL pointer dereference like the following can be seen when accessing members of the recently destructed multipath: BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 RIP: 0010:[<ffffffffa003db1b>] [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath] [<ffffffff81090ac0>] worker_thread+0x170/0x2a0 [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40 [switch to disabling pg_init in flush_multipath_work & header edits by Mike Snitzer] Signed-off-by: Shiva Krishna Merla <shivakrishna.merla@netapp.com> Reviewed-by: Krishnasamy Somasundaram <somasundaram.krishnasamy@netapp.com> Tested-by: Speagle Andy <Andy.Speagle@netapp.com> Acked-by: Junichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
-
- 31 Oct, 2013 1 commit
-
-
Mikulas Patocka authored
dm-mpath and dm-thin must process messages even if some device is suspended, so we allocate argv buffer with GFP_NOIO. These messages have a small fixed number of arguments. On the other hand, dm-switch needs to process bulk data using messages so excessive use of GFP_NOIO could cause trouble. The patch also lowers the default number of arguments from 64 to 8, so that there is smaller load on GFP_NOIO allocations. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 13 Oct, 2013 13 commits
-
-
Linus Torvalds authored
-
git://www.linux-watchdog.org/linux-watchdogLinus Torvalds authored
Pull watchdog fixes from Wim Van Sebroeck: "This will fix a deadlock on the ts72xx_wdt driver, fix bitmasks in the kempld_wdt driver and fix a section mismatch in the sunxi_wdt driver" * git://www.linux-watchdog.org/linux-watchdog: watchdog: sunxi: Fix section mismatch watchdog: kempld_wdt: Fix bit mask definition watchdog: ts72xx_wdt: locking bug in ioctl
-
Maxime Ripard authored
This driver has a section mismatch, for probe and remove functions, leading to the following warning during the compilation. WARNING: drivers/watchdog/built-in.o(.data+0x24): Section mismatch in reference from the variable sunxi_wdt_driver to the function .init.text:sunxi_wdt_probe() The variable sunxi_wdt_driver references the function __init sunxi_wdt_probe() Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Jingoo Han authored
STAGE_CFG bits are defined as [5:4] bits. However, '(((x) & 0x30) << 4)' handles [9:8] bits. Thus, it should be fixed in order to handle [5:4] bits. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Dan Carpenter authored
Calling the WDIOC_GETSTATUS & WDIOC_GETBOOTSTATUS and twice will cause a interruptible deadlock. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "A small batch of fixes this week, mostly OMAP related. Nothing stands out as particularly controversial. Also a fix for a 3.12-rc1 timer regression for Exynos platforms, including the Chromebooks" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: exynos: dts: Update 5250 arch timer node with clock frequency ARM: OMAP2: RX-51: Add missing max_current to rx51_lp5523_led_config ARM: mach-omap2: board-generic: fix undefined symbol ARM: dts: Fix pinctrl mask for omap3 ARM: OMAP3: Fix hardware detection for omap3630 when booted with device tree ARM: OMAP2: gpmc-onenand: fix sync mode setup with DT
-
Yuvaraj Kumar C D authored
Without the "clock-frequency" property in arch timer node, could able to see the below crash dump. [<c0014e28>] (unwind_backtrace+0x0/0xf4) from [<c0011808>] (show_stack+0x10/0x14) [<c0011808>] (show_stack+0x10/0x14) from [<c036ac1c>] (dump_stack+0x7c/0xb0) [<c036ac1c>] (dump_stack+0x7c/0xb0) from [<c01ab760>] (Ldiv0_64+0x8/0x18) [<c01ab760>] (Ldiv0_64+0x8/0x18) from [<c0062f60>] (clockevents_config.part.2+0x1c/0x74) [<c0062f60>] (clockevents_config.part.2+0x1c/0x74) from [<c0062fd8>] (clockevents_config_and_register+0x20/0x2c) [<c0062fd8>] (clockevents_config_and_register+0x20/0x2c) from [<c02b8e8c>] (arch_timer_setup+0xa8/0x134) [<c02b8e8c>] (arch_timer_setup+0xa8/0x134) from [<c04b47b4>] (arch_timer_init+0x1f4/0x24c) [<c04b47b4>] (arch_timer_init+0x1f4/0x24c) from [<c04b40d8>] (clocksource_of_init+0x34/0x58) [<c04b40d8>] (clocksource_of_init+0x34/0x58) from [<c049ed8c>] (time_init+0x20/0x2c) [<c049ed8c>] (time_init+0x20/0x2c) from [<c049b95c>] (start_kernel+0x1e0/0x39c) THis is because the Exynos u-boot, for example on the Chromebooks, doesn't set up the CNTFRQ register as expected by arch_timer. Instead, we have to specify the frequency in the device tree like this. Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com> [olof: Changed subject, added comment, elaborated on commit message] Signed-off-by: Olof Johansson <olof@lixom.net>
-
Olof Johansson authored
Merge tag 'fixes-against-v3.12-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes From Tony Lindgren: Few fixes for omap3 related hangs and errors that people have noticed now that people are actually using the device tree based booting for omap3. Also one regression fix for timer compile for dra7xx when omap5 is not selected, and a LED regression fix for n900. * tag 'fixes-against-v3.12-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2: RX-51: Add missing max_current to rx51_lp5523_led_config ARM: mach-omap2: board-generic: fix undefined symbol ARM: dts: Fix pinctrl mask for omap3 ARM: OMAP3: Fix hardware detection for omap3630 when booted with device tree ARM: OMAP2: gpmc-onenand: fix sync mode setup with DT Signed-off-by: Olof Johansson <olof@lixom.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc fixes from Helge Deller: "This patchset includes a bugfix to prevent a kernel crash when memory in page zero is accessed by the kernel itself, e.g. via probe_kernel_read(). Furthermore we now export flush_cache_page() which is needed (indirectly) by the lustre filesystem. The other patches remove unused functions and optimizes the page fault handler to only evaluate variables if needed, which again protects against possible kernel crashes" * 'parisc-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: let probe_kernel_read() capture access to page zero parisc: optimize variable initialization in do_page_fault parisc: fix interruption handler to respect pagefault_disable() parisc: mark parisc_terminate() noreturn and cold. parisc: remove unused syscall_ipi() function. parisc: kill SMP single function call interrupt parisc: Export flush_cache_page() (needed by lustre)
-
git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds authored
Pull slave-dmaengine fixes from Vinod Koul: "Another week, time to send another fixes request taking time out of extended weekend for the festivities in this part of the world. We have two fixes from Sergei for rcar driver and one fixing memory leak of edma driver by Geyslan" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dma: edma.c: remove edma_desc leakage rcar-hpbdma: add parameter to set_slave() method rcar-hpbdma: remove shdma_free_irq() calls
-
Helge Deller authored
Signed-off-by: Helge Deller <deller@gmx.de>
-
John David Anglin authored
The attached change defers the initialization of the variables tsk, mm and flags until they are needed. As a result, the code won't crash if a kernel probe is done with a corrupt context and the code will be better optimized. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
Running an "echo t > /proc/sysrq-trigger" crashes the parisc kernel. The problem is, that in print_worker_info() we try to read the workqueue info via the probe_kernel_read() functions which use pagefault_disable() to avoid crashes like this: probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq)); probe_kernel_read(&wq, &pwq->wq, sizeof(wq)); probe_kernel_read(name, wq->name, sizeof(name) - 1); The problem here is, that the first probe_kernel_read(&pwq) might return zero in pwq and as such the following probe_kernel_reads() try to access contents of the page zero which is read protected and generate a kernel segfault. With this patch we fix the interruption handler to call parisc_terminate() directly only if pagefault_disable() was not called (in which case preempt_count()==0). Otherwise we hand over to the pagefault handler which will try to look up the faulting address in the fixup tables. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v3.0+ Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
-