- 28 Feb, 2017 1 commit
-
-
Mike Snitzer authored
While cleaning up awkward branching in raid_message() a raid set "check" regression was introduced because "check" needs both MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED flags set. Fix this regression by explicitly setting both flags for the "check" case (like is also done for the "repair" case, but redundant set_bit()s are perfectly fine because it adds clarity to what is needed in response to both messages -- in addition this isn't fast path code). Fixes: 105db599 ("dm raid: cleanup awkward branching in raid_message() option processing") Reported-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 17 Feb, 2017 2 commits
-
-
Mikulas Patocka authored
Commit df2cb6da ("block: Avoid deadlocks with bio allocation by stacking drivers") created a workqueue for every bio set and code in bio_alloc_bioset() that tries to resolve some low-memory deadlocks by redirecting bios queued on current->bio_list to the workqueue if the system is low on memory. However other deadlocks (see below **) may happen, without any low memory condition, because generic_make_request is queuing bios to current->bio_list (rather than submitting them). ** the related dm-snapshot deadlock is detailed here: https://www.redhat.com/archives/dm-devel/2016-July/msg00065.html Fix this deadlock by redirecting any bios on current->bio_list to the bio_set's rescue workqueue on every schedule() call. Consequently, when the process blocks on a mutex, the bios queued on current->bio_list are dispatched to independent workqueus and they can complete without waiting for the mutex to be available. The structure blk_plug contains an entry cb_list and this list can contain arbitrary callback functions that are called when the process blocks. To implement this fix DM (ab)uses the onstack plug's cb_list interface to get its flush_current_bio_list() called at schedule() time. This fixes the snapshot deadlock - if the map method blocks, flush_current_bio_list() will be called and it redirects bios waiting on current->bio_list to appropriate workqueues. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1267650 Depends-on: df2cb6da ("block: Avoid deadlocks with bio allocation by stacking drivers") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
The sloppy nature of lockless access to percpu pointers (s->current_path) in rr_select_path(), from multiple threads, is causing some paths to used more than others -- which results in less IO performance being observed. Revert these upstream commits to restore truly symmetric round-robin IO submission in DM multipath: b0b477c7 dm round robin: use percpu 'repeat_count' and 'current_path' 802934b2 dm round robin: do not use this_cpu_ptr() without having preemption disabled There is no benefit to all this complexity if repeat_count = 1 (which is the recommended default). Cc: stable@vger.kernel.org # 4.6+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 16 Feb, 2017 14 commits
-
-
Mikulas Patocka authored
Fixes: dfcfac3e ("dm stats: collect and report histogram of IO latencies") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Bhumika Goyal authored
Declare dm_space_map structures as const as they are only passed as an argument to the function memcpy. This argument is of type const void *, so dm_space_map structures having this property can be declared as const. File size before: text data bss dec hex filename 4889 240 0 5129 1409 dm-space-map-metadata.o File size after: text data bss dec hex filename 5139 0 0 5139 1413 dm-space-map-metadata.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Big speed up with large configs. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
A more efficient way of creating a populated bitset. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
Improves __load_mapping_v1() and __load_mapping_v2() DMERR messages to explicitly name the cache block number whose mapping couldn't be loaded. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
If "metadata2" is provided as a table argument when creating/loading a cache target a more compact metadata format, with separate dirty bits, is used. "metadata2" improves speed of shutting down a cache target. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
dm_btree_del() is called from an ioctl so don't recurse into FS. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
The metadata_space_map_root passed to sm_ll_open_metadata() may or may not be arch aligned, use memcpy to ensure it is. This is not a fast path so the extra memcpy doesn't hurt us. Long-term it'd be better to use the kernel's alignment infrastructure to remove the memcpy()s that are littered across persistent-data (btree, array, space-maps, etc). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
A rounding bug due to compiler generated temporary being 32bit was found in remap_to_cache(). A localized cast in remap_to_cache() fixes the corruption but this preferred fix (changing from uint32_t to sector_t) eliminates potential for future rounding errors elsewhere. Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 25 Jan, 2017 6 commits
-
-
Mike Snitzer authored
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
For consistency, call read_disk_sb() from attempt_restore_of_faulty_devices() instead of calling sync_page_io() directly. Explicitly set device to faulty on superblock read error. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
Add md raid4/5/6 journaling support (upstream commit bac624f3 started the implementation) which closes the write hole (i.e. non-atomic updates to stripes) using a dedicated journal device. Background: raid4/5/6 stripes hold N data payloads per stripe plus one parity raid4/5 or two raid6 P/Q syndrome payloads in an in-memory stripe cache. Parity or P/Q syndromes used to recover any data payloads in case of a disk failure are calculated from the N data payloads and need to be updated on the different component devices of the raid device. Those are non-atomic, persistent updates. Hence a crash can cause failure to update all stripe payloads persistently and thus cause data loss during stripe recovery. This problem gets addressed by writing whole stripe cache entries (together with journal metadata) to a persistent journal entry on a dedicated journal device. Only if that journal entry is written successfully, the stripe cache entry is updated on the component devices of the raid device (i.e. writethrough type). In case of a crash, the entry can be recovered from the journal and be written again thus ensuring consistent stripe payload suitable to data recovery. Future dependencies: once writeback caching being worked on to compensate for the throughput implictions involved with writethrough overhead is supported with journaling in upstream, an additional patch based on this one will support it in dm-raid. Journal resilience related remarks: because stripes are recovered from the journal in case of a crash, the journal device better be resilient. Resilience becomes mandatory with future writeback support, because loosing the working set in the log means data loss as oposed to writethrough, were the loss of the journal device 'only' reintroduces the write hole. Fix comment on data offsets in parse_dev_params() and initialize new_data_offset as well. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
During raid set resize checks and setting up the recovery offset in case a raid set grows, calculated rd->md.dev_sectors is compared to rs->dev[0].rdev.sectors. Device 0 may not be defined in case userspace passes in '- -' for it (lvm2 doesn't do that so far), thus it's device sectors can't be taken authoritatively in this comparison and another valid device must be used to retrieve the device size. Use mddev->dev_sectors in checking for ongoing recovery for the same reason. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Heinz Mauelshagen authored
This fix addresses the following 3 failure scenarios: 1) If a (transiently) inaccessible metadata device is being passed into the constructor (e.g. a device tuple '254:4 254:5'), it is processed as if '- -' was given. This erroneously results in a status table line containing '- -', which mistakenly differs from what has been passed in. As a result, userspace libdevmapper puts the device tuple seperate from the RAID device thus not processing the dependencies properly. 2) False health status char 'A' instead of 'D' is emitted on the status status info line for the meta/data device tuple in this metadata device failure case. 3) If the metadata device is accessible when passed into the constructor but the data device (partially) isn't, that leg may be set faulty by the raid personality on access to the (partially) unavailable leg. Restore tried in a second raid device resume on such failed leg (status char 'D') fails after the (partial) leg returned. Fixes for aforementioned failure scenarios: - don't release passed in devices in the constructor thus allowing the status table line to e.g. contain '254:4 254:5' rather than '- -' - emit device status char 'D' rather than 'A' for the device tuple with the failed metadata device on the status info line - when attempting to restore faulty devices in a second resume, allow the device hot remove function to succeed by setting the device to not in-sync In case userspace intentionally passes '- -' into the constructor to avoid that device tuple (e.g. to split off a raid1 leg temporarily for later re-addition), the status table line will correctly show '- -' and the status info line will provide a '-' device health character for the non-defined device tuple. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 22 Jan, 2017 10 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fix from Thomas Gleixner: "Restore the retrigger callbacks in the IO APIC irq chips. That addresses a long standing regression which got introduced with the rewrite of the x86 irq subsystem two years ago and went unnoticed so far" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Restore IO-APIC irq_chip retrigger callback
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull smp/hotplug fix from Thomas Gleixner: "Remove an unused variable which is a leftover from the notifier removal" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Remove unused but set variable in _cpu_down()
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds authored
Pull virtio/vhost fixes from Michael Tsirkin: "Random fixes and cleanups that accumulated over the time" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio/s390: virtio: constify virtio_config_ops structures virtio/s390: add missing \n to end of dev_err message virtio/s390: support READ_STATUS command for virtio-ccw tools/virtio/ringtest: tweaks for s390 tools/virtio/ringtest: fix run-on-all.sh for offline cpus virtio_console: fix a crash in config_work_handler vhost/scsi: silence uninitialized variable warning vhost: scsi: constify target_core_fabric_ops structures
-
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linuxLinus Torvalds authored
Pull thermal management fixes from Zhang Rui: - fix a regression that thermal zone dynamically allocated sysfs attributes are freed before they're removed, which is introduced in 4.10-rc1 (Jacob von Chorus) - fix a boot warning because deprecated hwmon API is used (Fabio Estevam) - a couple of fixes for rockchip thermal driver (Brian Norris, Caesar Wang) * 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: rockchip: fixes the conversion table thermal: core: move tz->device.groups cleanup to thermal_release thermal: thermal_hwmon: Convert to hwmon_device_register_with_info() thermal: rockchip: handle set_trips without the trip points thermal: rockchip: optimize the conversion table thermal: rockchip: fixes invalid temperature case thermal: rockchip: don't pass table structs by value thermal: rockchip: improve conversion error messages
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg KH: "Here are a few small USB fixes for 4.10-rc5. Most of these are gadget/dwc2 fixes for reported issues, all of these have been in linux-next for a while. The last one is a single xhci WARN_ON removal to handle an issue that the dwc3 driver is hitting in the 4.10-rc tree. The warning is harmless and needs to be removed, and a "real" fix that is more complex will show up in 4.11-rc1 for this device. That last patch hasn't been in linux-next yet due to the weekend timing, but it's a "simple" WARN_ON() removal so what could go wrong? :)" Famous last words. * tag 'usb-4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: remove WARN_ON if dma mask is not set for platform devices usb: dwc2: host: fix Wmaybe-uninitialized warning usb: dwc2: gadget: Fix GUSBCFG.USBTRDTIM value usb: gadget: udc: atmel: remove memory leak usb: dwc3: exynos fix axius clock error path to do cleanup usb: dwc2: Avoid suspending if we're in gadget mode usb: dwc2: use u32 for DT binding parameters usb: gadget: f_fs: Fix iterations on endpoints. usb: dwc2: gadget: Fix DMA memory freeing usb: gadget: composite: Fix function used to free memory
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds authored
Pull libnvdimm fixes from Dan Williams: "Two fixes: - a regression fix for the multiple-pmem-namespace-per-region support added in 4.9. Even if an existing environment is not using that feature the act of creating and a destroying a single namespace with the ndctl utility will lead to the proliferation of extra unwanted namespace devices. - a fix for the error code returned from the pmem driver when the memcpy_mcsafe() routine returns -EFAULT. Btrfs seems to be the only block I/O consumer that tries to parse the meaning of the error code when it is non-zero. Neither of these fixes are critical, the namespace leak is awkward in that it can cause device naming to change and complicates debugging namespace initialization issues. The error code fix is included out of caution for what other consumers might be expecting -EIO for block I/O errors" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm, namespace: fix pmem namespace leak, delete when size set to zero pmem: return EIO on read_pmem() failure
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds authored
Pull clk fix from Stephen Boyd: "One fix for Samsung Exynos524x SoCs where recent IOMMU patches have caused some of these clocks to turn off when they were always left on before" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk/samsung: exynos542x: mark some clocks as critical
-
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arcLinus Torvalds authored
Pull ARC fixes from Vineet Gupta: - more intc updates [Yuriv] - fix module build when unwinder is turned off - IO Coherency Programming model updates - other miscellaneous * tag 'arc-4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: Revert "ARC: mm: IOC: Don't enable IOC by default" ARC: mm: split arc_cache_init to allow __init reaping of bulk ARCv2: IOC: Use actual memory size to setup aperture size ARCv2: IOC: Adhere to progamming model guidelines to avoid DMA corruption ARCv2: IOC: refactor the IOC and SLC operations into own functions ARC: module: Fix !CONFIG_ARC_DW2_UNWIND builds ARCv2: save r30 on kernel entry as gcc uses it for code-gen ARCv2: IRQ: Call entry/exit functions for chained handlers in MCIP ARC: IRQ: Use hwirq instead of virq in mask/unmask ARC: mmu: clarify the MMUv3 programming model
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: "Two fixes for fallout from the hugetlb changes we merged this cycle. Ten other fixes, four only affect Power9, and the rest are a bit of a mixture though nothing terrible. Thanks to: Aneesh Kumar K.V, Anton Blanchard, Benjamin Herrenschmidt, Dave Martin, Gavin Shan, Madhavan Srinivasan, Nicholas Piggin, Reza Arbab" * tag 'powerpc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Ignore reserved field in DCSR and PVR reads and writes powerpc/ptrace: Preserve previous TM fprs/vsrs on short regset write powerpc/ptrace: Preserve previous fprs/vsrs on short regset write powerpc/perf: Use MSR to report privilege level on P9 DD1 selftest/powerpc: Wrong PMC initialized in pmc56_overflow test powerpc/eeh: Enable IO path on permanent error powerpc/perf: Fix PM_BRU_CMPL event code for power9 powerpc/mm: Fix little-endian 4K hugetlb powerpc/mm/hugetlb: Don't panic when we don't find the default huge page size powerpc: Fix pgtable pmd cache init powerpc/icp-opal: Fix missing KVM case and harden replay powerpc/mm: Fix memory hotplug BUG() on radix
-
- 20 Jan, 2017 7 commits
-
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Radim Krčmář: "ARM: - Fix for timer setup on VHE machines - Drop spurious warning when the timer races against the vcpu running again - Prevent a vgic deadlock when the initialization fails (for stable) s390: - Fix a kernel memory exposure (for stable) x86: - Fix exception injection when hypercall instruction cannot be patched" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: do not expose random data via facility bitmap KVM: x86: fix fixing of hypercalls KVM: arm/arm64: vgic: Fix deadlock on error handling KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems KVM: arm/arm64: Fix occasional warning from the timer work function
-
Linus Torvalds authored
Merge branch 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux Pull SCSI target fixes from Bart Van Assche: - two small fixes for the ibmvscsis driver - ten patches with bug fixes for the target mode of the qla2xxx driver - four patches that avoid that the "sparse" and "smatch" static analyzer tools report false positives for the qla2xxx code base * 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux: qla2xxx: Disable out-of-order processing by default in firmware qla2xxx: Fix erroneous invalid handle message qla2xxx: Reduce exess wait during chip reset qla2xxx: Terminate exchange if corrupted qla2xxx: Fix crash due to null pointer access qla2xxx: Collect additional information to debug fw dump qla2xxx: Reset reserved field in firmware options to 0 qla2xxx: Set tcm_qla2xxx version to automatically track qla2xxx version qla2xxx: Include ATIO queue in firmware dump when in target mode qla2xxx: Fix wrong IOCB type assumption qla2xxx: Avoid that building with W=1 triggers complaints about set-but-not-used variables qla2xxx: Move two arrays from header files to .c files qla2xxx: Declare an array with file scope static qla2xxx: Fix indentation ibmvscsis: Fix sleeping in interrupt context ibmvscsis: Fix max transfer length
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: "Just two small fixes for this -rc. One is just killing an unused variable from Keith, but the other fixes a performance regression for nbd in this series, where we inadvertently flipped when we set MSG_MORE when outputting data" * 'for-linus' of git://git.kernel.dk/linux-block: nbd: only set MSG_MORE when we have more to send blk-mq: Remove unused variable
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spiLinus Torvalds authored
Pull spi fixes from Mark Brown: "The usual small smattering of driver specific fixes. A few bits that stand out here: - the R-Car patches adding fallbacks are just adding new compatible strings to the driver so that device trees are written in a more robustly future proof fashion, this isn't strictly a fix but it's just new IDs and it's better to get it into mainline sooner to improve the ABI - the DesignWare "switch to new API part 2" patch is actually a misleadingly titled fix for a bit that got missed in the original conversion" * tag 'spi-fix-v4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: davinci: use dma_mapping_error() spi: spi-axi: Free resources on error path spi: pxa2xx: add missed break spi: dw-mid: switch to new dmaengine_terminate_* API (part 2) spi: dw: Make debugfs name unique between instances spi: sh-msiof: Do not use C++ style comment spi: armada-3700: Set mode bits correctly spi: armada-3700: fix unsigned compare than zero on irq spi: sh-msiof: Add R-Car Gen 2 and 3 fallback bindings spi: SPI_FSL_DSPI should depend on HAS_DMA
-
git://github.com/ceph/ceph-clientLinus Torvalds authored
Pull ceph fixes from Ilya Dryomov: "Three filesystem endianness fixes (one goes back to the 2.6 era, all marked for stable) and two fixups for this merge window's patches" * tag 'ceph-for-4.10-rc5' of git://github.com/ceph/ceph-client: ceph: fix bad endianness handling in parse_reply_info_extra ceph: fix endianness bug in frag_tree_split_cmp ceph: fix endianness of getattr mask in ceph_d_revalidate libceph: make sure ceph_aes_crypt() IV is aligned ceph: fix ceph_get_caps() interruption
-
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfsLinus Torvalds authored
Pull overlayfs fix from Miklos Szeredi: "This fixes a regression introduced in this cycle" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix possible use after free on redirect dir lookup
-
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuseLinus Torvalds authored
Pull fuse fixes from Miklos Szeredi: "Fix two regressions, one introduced in 4.9 and a less recent one in 4.2" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix time_to_jiffies nsec sanity check fuse: clear FR_PENDING flag when moving requests out of pending queue
-