- 21 Apr, 2015 16 commits
-
-
shli@kernel.org authored
If io error happens in any stripe of a batch list, the batch list will be split, then normal process will run for the stripes in the list. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
shli@kernel.org authored
stripe cache is 4k size. Even adjacent full stripe writes are handled in 4k unit. Idealy we should use big size for adjacent full stripe writes. Bigger stripe cache size means less stripes runing in the state machine so can reduce cpu overhead. And also bigger size can cause bigger IO size dispatched to under layer disks. With below patch, we will automatically batch adjacent full stripe write together. Such stripes will be added to the batch list. Only the first stripe of the list will be put to handle_list and so run handle_stripe(). Some steps of handle_stripe() are extended to cover all stripes of the list, including ops_run_io, ops_run_biodrain and so on. With this patch, we have less stripes running in handle_stripe() and we send IO of whole stripe list together to increase IO size. Stripes added to a batch list have some limitations. A batch list can only include full stripe write and can't cross chunk boundary to make sure stripes have the same parity disks. Stripes in a batch list must be in the same state (no written, toread and so on). If a stripe is in a batch list, all new read/write to add_stripe_bio will be blocked to overlap conflict till the batch list is handled. The limitations will make sure stripes in a batch list be in exactly the same state in the life circly. I did test running 160k randwrite in a RAID5 array with 32k chunk size and 6 PCIe SSD. This patch improves around 30% performance and IO size to under layer disk is exactly 32k. I also run a 4k randwrite test in the same array to make sure the performance isn't changed with the patch. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
shli@kernel.org authored
Track overwrite disk count, so we can know if a stripe is a full stripe write. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
shli@kernel.org authored
A freshly new stripe with write request can be batched. Any time the stripe is handled or new read is queued, the flag will be cleared. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
shli@kernel.org authored
Use flex_array for scribble data. Next patch will batch several stripes together, so scribble data should be able to cover several stripes, so this patch also allocates scribble data for stripes across a chunk. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Heinz Mauelshagen authored
md raid0: access mddev->queue (request queue member) conditionally because it is not set when accessed from dm-raid The patch makes 3 references to mddev->queue in the raid0 personality conditional in order to allow for it to be accessed from dm-raid. Mandatory, because md instances underneath dm-raid don't manage a request queue of their own which'd lead to oopses without the patch. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Tested-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When md notices non-sync IO happening while it is trying to resync (or reshape or recover) it slows down to the set minimum. The default minimum might have made sense many years ago but the drives have become faster. Changing the default to match the times isn't really a long term solution. This patch changes the code so that instead of waiting until the speed has dropped to the target, it just waits until pending requests have completed. This means that the delay inserted is a function of the speed of the devices. Testing shows that: - for some loads, the resync speed is unchanged. For those loads increasing the minimum doesn't change the speed either. So this is a good result. To increase resync speed under such loads we would probably need to increase the resync window size. - for other loads, resync speed does increase to a reasonable fraction (e.g. 20%) of maximum possible, and throughput of the load only drops a little bit (e.g. 10%) - for other loads, throughput of the non-sync load drops quite a bit more. These seem to be latency-sensitive loads. So it isn't a perfect solution, but it is mostly an improvement. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
This option is not well justified and testing suggests that it hardly ever makes any difference. The comment suggests there might be a need to wait for non-resync activity indicated by ->nr_waiting, however raise_barrier() already waits for all of that. So just remove it to simplify reasoning about speed limiting. This allows us to remove a 'FIXME' comment from raid5.c as that never used the flag. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
There is really no need for sync_min to be a multiple of chunk_size, and values read from here often aren't. That means you cannot read a value and expect to be able to write it back later. So remove the chunk_size check, and round down to a multiple of 4K, to be sure everything works with 4K-sector devices. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
-
Goldwyn Rodrigues authored
When "re-add" is writted to /sys/block/mdXX/md/dev-YYY/state, the clustered md: 1. Sends RE_ADD message with the desc_nr. Nodes receiving the message clear the Faulty bit in their respective rdev->flags. 2. The node initiating re-add, gathers the bitmaps of all nodes and copies them into the local bitmap. It does not clear the bitmap from which it is copying. 3. Initiating node schedules a md recovery to sync the devices. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Goldwyn Rodrigues authored
This adds the capability of re-adding a failed disk by writing "re-add" to /sys/block/mdXX/md/dev-YYY/state. This facilitates adding disks which have encountered a temporary error such as a network disconnection/hiccup in an iSCSI device, or a SAN cable disconnection which has been restored. In such a situation, you do not need to remove and re-add the device. Writing re-add to the failed device's state would add it again to the array and perform the recovery of only the blocks which were written after the device failed. This works for generic md, and is not related to clustering. However, this patch is to ease re-add operations listed above in clustering environments. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Goldwyn Rodrigues authored
This adds "remove" capabilities for the clustered environment. When a user initiates removal of a device from the array, a REMOVE message with disk number in the array is sent to all the nodes which kick the respective device in their own array. This facilitates the removal of failed devices. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Goldwyn Rodrigues authored
This is required by the clustering module (patches to follow) to find the device to remove or re-add. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Goldwyn Rodrigues authored
This export is required for clustering module in order to co-ordinate remove/readd a rdev from all nodes. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Guoqing Jiang authored
Since the node num of md-cluster is from zero, and cinfo->slot_number represents the slot num of dlm, no need to check for equality. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
- 10 Apr, 2015 1 commit
-
-
NeilBrown authored
Since commit 20d0189b in v3.14-rc1 RAID0 has performed incorrect calculations when the chunksize is not a power of 2. This happens because "sector_div()" modifies its first argument, but this wasn't taken into account in the patch. So restore that first arg before re-using the variable. Reported-by: Joe Landman <joe.landman@gmail.com> Reported-by: Dave Chinner <david@fromorbit.com> Fixes: 20d0189b Cc: stable@vger.kernel.org (3.14 and later). Signed-off-by: NeilBrown <neilb@suse.de>
-
- 08 Apr, 2015 1 commit
-
-
Gu Zheng authored
Simon reported the md io stats accounting issue: " I'm seeing "iostat -x -k 1" print this after a RAID1 rebuild on 4.0-rc5. It's not abnormal other than it's 3-disk, with one being SSD (sdc) and the other two being write-mostly: Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 345.00 0.00 0.00 0.00 0.00 100.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 58779.00 0.00 0.00 0.00 0.00 100.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 12.00 0.00 0.00 0.00 0.00 100.00 " The cause is commit "18c0b223" uses the generic_start_io_acct to account the disk stats rather than the open code, but it also introduced the increase to .in_flight[rw] which is needless to md. So we re-use the open code here to fix it. Reported-by: Simon Kirby <sim@hostway.ca> Cc: <stable@vger.kernel.org> 3.19 Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
- 06 Apr, 2015 8 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) In TCP, don't register an FRTO for cumulatively ACK'd data that was previously SACK'd, from Neal Cardwell. 2) Need to hold RNL mutex in ipv4 multicast code namespace cleanup, from Cong WANG. 3) Similarly we have to hold RNL mutex for fib_rules_unregister(), also from Cong WANG. 4) Revert and rework netns nsid allocation fix, from Nicolas Dichtel. 5) When we encapsulate for a tunnel device, skb->sk still points to the user socket. So this leads to cases where we retraverse the ipv4/ipv6 output path with skb->sk being of some other address family (f.e. AF_PACKET). This can cause things to crash since the ipv4 output path is dereferencing an AF_PACKET socket as if it were an ipv4 one. The short term fix for 'net' and -stable is to elide these socket checks once we've entered an encapsulation sequence by testing xmit_recursion. Longer term we have a better solution wherein we pass the tunnel's socket down through the output paths, but that is way too invasive for 'net' and -stable. From Hannes Frederic Sowa. 6) l2tp_init() failure path forgets to unregister per-net ops, from Cong WANG. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net/mlx4_core: Fix error message deprecation for ConnectX-2 cards net: dsa: fix filling routing table from OF description l2tp: unregister l2tp_net_ops on failure path mvneta: dont call mvneta_adjust_link() manually ipv6: protect skb->sk accesses from recursive dereference inside the stack netns: don't allocate an id for dead netns Revert "netns: don't clear nsid too early on removal" ip6mr: call del_timer_sync() in ip6mr_free_table() net: move fib_rules_unregister() under rtnl lock ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup tcp: fix FRTO undo on cumulative ACK of SACKed range xen-netfront: transmit fully GSO-sized packets
-
Jack Morgenstein authored
Commit 1daa4303 ("net/mlx4_core: Deprecate error message at ConnectX-2 cards startup to debug") did the deprecation only for port 1 of the card. Need to deprecate for port 2 as well. Fixes: 1daa4303 ("net/mlx4_core: Deprecate error message at ConnectX-2 cards startup to debug") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Nakonechny authored
According to description in 'include/net/dsa.h', in cascade switches configurations where there are more than one interconnected devices, 'rtable' array in 'dsa_chip_data' structure is used to indicate which port on this switch should be used to send packets to that are destined for corresponding switch. However, dsa_of_setup_routing_table() fills 'rtable' with port numbers of the _target_ switch, but not current one. This commit removes redundant devicetree parsing and adds needed port number as a function argument. So dsa_of_setup_routing_table() now just looks for target switch number by parsing parent of 'link' device node. To remove possible misunderstandings with the way of determining target switch number, a corresponding comment was added to the source code and to the DSA device tree bindings documentation file. This was tested on a custom board with two Marvell 88E6095 switches with following corresponding routing tables: { -1, 10 } and { 8, -1 }. Signed-off-by: Pavel Nakonechny <pavel.nakonechny@skitlab.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull input fixes from Dmitry Torokhov: "Updates for the input subsystem - two more tweaks for ALPS driver to work out kinks after splitting the touchpad, trackstick, and potential external PS/2 mouse into separate input devices. Changes to support ALPS SS4 devices (protocol V8) will be coming in 4.1..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: alps - document stick behavior for protocol V2 Input: alps - report V2 Dualpoint Stick events via the right evdev node Input: alps - report interleaved bare PS/2 packets via dev3
-
WANG Cong authored
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stas Sergeev authored
mvneta_adjust_link() is a callback for of_phy_connect() and should not be called directly. The result of calling it directly is as below: Signed-off-by: David S. Miller <davem@davemloft.net>
-
hannes@stressinduktion.org authored
We should not consult skb->sk for output decisions in xmit recursion levels > 0 in the stack. Otherwise local socket settings could influence the result of e.g. tunnel encapsulation process. ipv6 does not conform with this in three places: 1) ip6_fragment: we do consult ipv6_npinfo for frag_size 2) sk_mc_loop in ipv6 uses skb->sk and checks if we should loop the packet back to the local socket 3) ip6_skb_dst_mtu could query the settings from the user socket and force a wrong MTU Furthermore: In sk_mc_loop we could potentially land in WARN_ON(1) if we use a PF_PACKET socket ontop of an IPv6-backed vxlan device. Reuse xmit_recursion as we are currently only interested in protecting tunnel devices. Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 05 Apr, 2015 3 commits
-
-
Hans de Goede authored
Document that protocol V2 uses standard (bare) PS/2 mouse packets for the DualPoint stick. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-By: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
-
Hans de Goede authored
On V2 devices the DualPoint Stick reports bare packets, these should be reported via the "AlpsPS/2 ALPS DualPoint Stick" dev2 evdev node, which also has the INPUT_PROP_POINTING_STICK propbit set. Note that since there is no way to distinguish these packets from an external PS/2 mouse (insofar as these laptops have an external PS/2 port) this means that we will be reporting PS/2 mouse events via this evdev node too, as we've been doing in kernel 3.19 and older. This has been tested on a Dell Latitude D620 and a Dell Latitude E6400, which both have a V2 touchpad + a DualPoint Stick which reports bare packets. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
-
Hans de Goede authored
Bare packets should be reported via the same evdev device independent on whether they are detected on the beginning of a packet or in the middle of a packet. This has been tested on a Dell Latitude E6400, where the DualPoint Stick reports bare packets, which get reported via dev3 when the touchpad is idle, and via dev2 when the touchpad and stick are used simultaneously. This commit fixes this inconsistency by always reporting bare packets via dev3. Note that since the come from a DualPoint Stick they really should be reported via dev2, this gets fixed in a later commit. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
-
- 04 Apr, 2015 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg KH: "Here are some small USB fixes and new device ids for 4.0-rc6. Nothing major, some xhci fixes for reported problems, and some usb-serial device ids. All have been in linux-next for a while" * tag 'usb-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: ftdi_sio: Use jtag quirk for SNAP Connect E10 usb: isp1760: fix spin unlock in the error path of isp1760_udc_start usb: xhci: apply XHCI_AVOID_BEI quirk to all Intel xHCI controllers usb: xhci: handle Config Error Change (CEC) in xhci driver USB: keyspan_pda: add new device id USB: ftdi_sio: Added custom PID for Synapse Wireless product
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging driver fixes from Greg KH: "Here are some staging driver fixes, well, really all just IIO driver fixes, for 4.0-rc6. They fix issues that have been reported with these drivers. All of these patches have been in linux-next for a while" * tag 'staging-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: imu: Use iio_trigger_get for indio_dev->trig assignment iio: adc: vf610: use ADC clock within specification iio/adc/cc10001_adc.c: Fix !HAS_IOMEM build iio: core: Fix double free. iio:inv-mpu6050: Fix inconsistency for the scale channel staging: iio: dummy: Fix undefined symbol build error iio: inv_mpu6050: Clear timestamps fifo while resetting hardware fifo staging: iio: hmc5843: Set iio name property in sysfs iio: bmc150: change sampling frequency iio: fix drivers that check buffer->scan_mask
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds authored
Pull tty/serial fixes from Greg KH: "Here are 3 serial driver fixes for 4.0-rc6. They fix some reported issues with the samsung and fsl_lpuart drivers. All have been in linux-next for a while" * tag 'tty-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: serial: fsl_lpuart: clear receive flag on FIFO flush tty: serial: fsl_lpuart: specify transmit FIFO size serial: samsung: Clear operation mode on UART shutdown
-
- 03 Apr, 2015 8 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull input subsystem fixes from Dmitry Torokhov: "A fix for ALPS driver for issue introduced in the latest update and a tweak for yet another Lenovo box in Synaptics. There will be more ALPS tweaks coming.." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: define INPUT_PROP_ACCELEROMETER behavior Input: synaptics - fix min-max quirk value for E440 Input: synaptics - add quirk for Thinkpad E440 Input: ALPS - fix max coordinates for v5 and v7 protocols Input: add MT_TOOL_PALM
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block layer fix from Jens Axboe: "Just one patch in this pull request, fixing a regression caused by a 'mathematically correct' change to lcm()" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix blk_stack_limits() regression due to lcm() change
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Ingo Molnar: "Misc fixes: a SYSRET single-stepping fix, a dmi-scan robustization fix, a reboot quirk and a kgdb fixlet" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kgdb/x86: Fix reporting of 'si' in kgdb on x86_64 x86/asm/entry/64: Disable opportunistic SYSRET if regs->flags has TF set x86/reboot: Add ASRock Q1900DC-ITX mainboard reboot quirk MAINTAINERS: Change the x86 microcode loader maintainer firmware: dmi_scan: Prevent dmi_num integer overflow
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Ingo Molnar: "Two x86 Intel PMU constraint handling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix Haswell CYCLE_ACTIVITY.* counter constraints perf/x86/intel: Filter branches for PEBS event
-
git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linuxLinus Torvalds authored
Pull devicetree fix from Grant Likely: "Simple bugfix for bad device tree data on the PA-Semi platform" * tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux: drivers/of: Add empty ranges quirk for PA-Semi
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull CIFS fixes from Steve French: "A set of small cifs fixes fixing a memory leak, kernel oops, and infinite loop (and some spotted by Coverity)" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: Fix warning Fix another dereference before null check warning CIFS: session servername can't be null Fix warning on impossible comparison Fix coverity warning Fix dereference before null check warning Don't ignore errors on encrypting password in SMBTcon Fix warning on uninitialized buftype cifs: potential memory leaks when parsing mnt opts cifs: fix use-after-free bug in find_writable_file cifs: smb2_clone_range() - exit on unhandled error
-
Nicolas Dichtel authored
First, let's explain the problem. Suppose you have an ipip interface that stands in the netns foo and its link part in the netns bar (so the netns bar has an nsid into the netns foo). Now, you remove the netns bar: - the bar nsid into the netns foo is removed - the netns exit method of ipip is called, thus our ipip iface is removed: => a netlink message is built in the netns foo to advertise this deletion => this netlink message requests an nsid for bar, thus a new nsid is allocated for bar and never removed. This patch adds a check in peernet2id() so that an id cannot be allocated for a netns which is currently destroyed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
This reverts commit 4217291e ("netns: don't clear nsid too early on removal"). This is not the right fix, it introduces races. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-