- 07 Mar, 2014 40 commits
-
-
Tejun Heo authored
commit 532de3fc upstream. Currently, there's nothing preventing cgroup_enable_task_cg_lists() from missing set PF_EXITING and race against cgroup_exit(). Depending on the timing, cgroup_exit() may finish with the task still linked on css_set leading to list corruption. Fix it by grabbing siglock in cgroup_enable_task_cg_lists() so that PF_EXITING is guaranteed to be visible. This whole on-demand cg_list optimization is extremely fragile and has ample possibility to lead to bugs which can cause things like once-a-year oops during boot. I'm wondering whether the better approach would be just adding "cgroup_disable=all" handling which disables the whole cgroup rather than tempting fate with this on-demand craziness. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tejun Heo authored
commit 48573a89 upstream. cgroup_cfts_commit() walks the cgroup hierarchy that the target subsystem is attached to and tries to apply the file changes. Due to the convolution with inode locking, it can't keep cgroup_mutex locked while iterating. It currently holds only RCU read lock around the actual iteration and then pins the found cgroup using dget(). Unfortunately, this is incorrect. Although the iteration does check cgroup_is_dead() before invoking dget(), there's nothing which prevents the dentry from going away inbetween. Note that this is different from the usual css iterations where css_tryget() is used to pin the css - css_tryget() tests whether the css can be pinned and fails if not. The problem can be solved by simply holding cgroup_mutex instead of RCU read lock around the iteration, which actually reduces LOC. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tejun Heo authored
commit b58c8998 upstream. cgroup_create() was returning 0 after allocation failures. Fix it. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tejun Heo authored
commit eb46bf89 upstream. When cgroup_mount() fails to allocate an id for the root, it didn't set ret before jumping to unlock_drop ending up returning 0 after a failure. Fix it. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hui Wang authored
commit 1de7ca5e upstream. The front headphone and mic jackes on a HP desktop model (Vendor Id: 0x111d76c7 Subsystem Id: 0x103c2b17) can not work, the codec on this machine has 8 physical ports, 6 of them are routed to rear jackes and all of them work very well, while the remaining 2 ports are routed to front headphone and mic jackes, but the corresponding pin complex node are not defined correctly. After apply this fix, the front audio jackes can work very well. [trivial fix of enum definition by tiwai] BugLink: https://bugs.launchpad.net/bugs/1282369 Cc: David Henningsson <david.henningsson@canonical.com> Tested-by:
Gerald Yang <gerald.yang@canonical.com> Signed-off-by:
Hui Wang <hui.wang@canonical.com> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hsin-Yu Chao authored
commit 13c12dbe upstream. Incorrect ADC is picked in ca0132_capture_pcm_prepare(), where it assumes multiple streams while there is one stream per ADC. Note that ca0132_capture_pcm_cleanup() already does the right thing. The Chromebook Pixel has a microphone under the keyboard that is attached to node id 0x8. Before this fix, recording would always go to the main internal mic (node id 0x7). Signed-off-by:
Hsin-Yu Chao <hychao@chromium.org> Reviewed-by:
Dylan Reid <dgreid@chromium.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hsin-Yu Chao authored
commit 28fba950 upstream. When a HDMI stream is opened with the same stream tag as a following opened stream to ca0132, audio will be heard from two ports simultaneously. Fix this issue by change to use snd_hda_codec_setup_stream and snd_hda_codec_cleanup_stream instead, so that an inactive stream can be marked as 'dirty' when found with a conflict stream tag, and then get purified. Signed-off-by:
Hsin-Yu Chao <hychao@chromium.org> Reviewed-by:
Chih-Chung Chang <chihchung@chromium.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hui Wang authored
commit 4913e0bf upstream. When we plug a 3-ring headset on the Dell machines (Vendor ID: 0x10ec0255, Subsystem ID: 0x10280657; Vendor ID: 0x10ec0255, Subsystem ID: 0x1028065f), the headset mic can't be detected, after apply this patch, the headset mic can work well. BugLink: https://bugs.launchpad.net/bugs/1260303 Cc: David Henningsson <david.henningsson@canonical.com> Tested-by:
Cyrus Lien <cyrus.lien@canonical.com> Signed-off-by:
Hui Wang <hui.wang@canonical.com> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Clemens Ladisch authored
commit 624aef49 upstream. When the driver tries to access Function Unit 10, the KEF X300A speakers' firmware apparently locks up, making even PCM streaming impossible. Work around this by ignoring this FU. Signed-off-by:
Clemens Ladisch <clemens@ladisch.de> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit 70b271a7 ] batadv_send_skb_prepare_unicast(_4addr) might reallocate the skb's data. If it does then our ethhdr pointer is not valid anymore in batadv_send_skb_unicast(), resulting in a kernel paging error. Fixing this by refetching the ethhdr pointer after the potential reallocation. Signed-off-by:
Linus Lüssing <linus.luessing@web.de> Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit a5a5cb8c ] In the failure path of the orig_node initialization routine the orig_node->bat_iv.bcast_own field is free'd twice: first in batadv_iv_ogm_orig_get() and then later in batadv_orig_node_free_rcu(). Fix it by removing the kfree in batadv_iv_ogm_orig_get(). Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit 05c3c8a6 ] When the TVLV parsing routine succeed the skb is left untouched thus leading to a memory leak. Fix this by consuming the skb in case of success. Introduced by ef261577 ("batman-adv: tvlv - basic infrastructure") Reported-by:
Russel Senior <russell@personaltelco.net> Signed-off-by:
Antonio Quartulli <antonio@open-mesh.com> Tested-by:
Russell Senior <russell@personaltelco.net> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit a30e22ca ] When computing the CRC on a 2byte variable the order of the bytes obviously alters the final result. This means that computing the CRC over the same value on two archs having different endianess leads to different numbers. The global and local translation table CRC computation routine makes this mistake while processing the clients VIDs. The result is a continuous CRC mismatching between nodes having different endianess. Fix this by converting the VID to Network Order before processing it. This guarantees that every node uses the same byte order. Introduced by 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN specific") Reported-by:
Russel Senior <russell@personaltelco.net> Signed-off-by:
Antonio Quartulli <antonio@open-mesh.com> Tested-by:
Russell Senior <russell@personaltelco.net> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Simon Wunderlich authored
[ Upstream commit b2262df7 ] Since batadv_orig_node_new() sets the refcount to two, assuming that the calling function will use a reference for putting the orig_node into a hash or similar, both references must be freed if initialization of the orig_node fails. Otherwise that object may be leaked in that error case. Reported-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit 08bf0ed2 ] When adding a new neighbour it is important to atomically perform the following: - check if the neighbour already exists - append the neighbour to the proper list If the two operations are not performed in an atomic context it is possible that two concurrent insertions add the same neighbour twice. Signed-off-by:
Antonio Quartulli <antonio@open-mesh.com> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit f1791425 ] pskb_may_pull() returns 1 on success and 0 in case of failure, therefore checking for the return value being negative does not make sense at all. This way if the function fails we will probably read beyond the current skb data buffer. Fix this by doing the proper check. Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit 91c2b1a9 ] There is a refcounter unbalance in the CRC checking routine invoked on OGM reception. A vlan object is retrieved (thus its refcounter is increased by one) but it is never properly released. This leads to a memleak because the vlan object will never be free'd. Fix this by releasing the vlan object after having read the CRC. Reported-by:
Russell Senior <russell@personaltelco.net> Reported-by:
Daniel <daniel@makrotopia.org> Reported-by:
cmsv <cmsv@wirelesspt.net> Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit e889241f ] When accessing a TT-TVLV container in the OGM RX path the variable pointing to the list of changes to apply is altered by mistake. This makes the TT component read data at the wrong position in the OGM packet buffer. Fix it by removing the bogus pointer alteration. Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Antonio Quartulli authored
[ Upstream commit 930cd6e4 ] The current MTU computation always returns a value smaller than 1500bytes even if the real interfaces have an MTU large enough to compensate the batman-adv overhead. Fix the computation by properly returning the highest admitted value. Introduced by a19d3d85 ("batman-adv: limit local translation table max size") Reported-by:
Russell Senior <russell@personaltelco.net> Signed-off-by:
Antonio Quartulli <antonio@meshcoding.com> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit ed98df33 ] sock_alloc_send_pskb() & sk_page_frag_refill() have a loop trying high order allocations to prepare skb with low number of fragments as this increases performance. Problem is that under memory pressure/fragmentation, this can trigger OOM while the intent was only to try the high order allocations, then fallback to order-0 allocations. We had various reports from unexpected regressions. According to David, setting __GFP_NORETRY should be fine, as the asynchronous compaction is still enabled, and this will prevent OOM from kicking as in : CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0, oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled CFSClientEventm Call Trace: [<ffffffff8043766c>] dump_header+0xe1/0x23e [<ffffffff80437a02>] oom_kill_process+0x6a/0x323 [<ffffffff80438443>] out_of_memory+0x4b3/0x50d [<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7 [<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0 [<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0 [<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160 [<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0 [<ffffffff802a5037>] inet_sendmsg+0x67/0x100 [<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90 [<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0 [<ffffffff802847b6>] __sys_sendmsg+0x136/0x430 [<ffffffff80284ec8>] sys_sendmsg+0x88/0x110 [<ffffffff80711472>] system_call_fastpath+0x16/0x1b Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child Signed-off-by:
Eric Dumazet <edumazet@google.com> Acked-by:
David Rientjes <rientjes@google.com> Acked-by:
"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
willy tarreau authored
[ Upstream commit 71f6d1b3 ] Right now the mvneta driver doesn't handle Tx IRQ, and relies on two mechanisms to flush Tx descriptors : a flush at the end of mvneta_tx() and a timer. If a burst of packets is emitted faster than the device can send them, then the queue is stopped until next wake-up of the timer 10ms later. This causes jerky output traffic with bursts and pauses, making it difficult to reach line rate with very few streams. A test on UDP traffic shows that it's not possible to go beyond 134 Mbps / 12 kpps of outgoing traffic with 1500-bytes IP packets. Routed traffic tends to observe pauses as well if the traffic is bursty, making it even burstier after the wake-up. It seems that this feature was inherited from the original driver but nothing there mentions any reason for not using the interrupt instead, which the chip supports. Thus, this patch enables Tx interrupts and removes the timer. It does the two at once because it's not really possible to make the two mechanisms coexist, so a split patch doesn't make sense. First tests performed on a Mirabox (Armada 370) show that less CPU seems to be used when sending traffic. One reason might be that we now call the mvneta_tx_done_gbe() with a mask indicating which queues have been done instead of looping over all of them. The same UDP test above now happily reaches 987 Mbps / 87.7 kpps. Single-stream TCP traffic can now more easily reach line rate. HTTP transfers of 1 MB objects over a single connection went from 730 to 840 Mbps. It is even possible to go significantly higher (>900 Mbps) by tweaking tcp_tso_win_divisor. Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: Arnaud Ebalard <arno@natisbad.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Tested-by:
Arnaud Ebalard <arno@natisbad.org> Signed-off-by:
Willy Tarreau <w@1wt.eu> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
willy tarreau authored
[ Upstream commit 40ba35e7 ] Marvell has not published the chip's datasheet yet, so it's very hard to find the relevant bits to manipulate to change the IRQ behaviour. Fortunately, these bits are described in the proprietary LSP patch set which is publicly available here : http://www.plugcomputer.org/downloads/mirabox/ So let's put them back in the driver in order to reduce the burden of current and future maintenance. Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Tested-by:
Arnaud Ebalard <arno@natisbad.org> Signed-off-by:
Willy Tarreau <w@1wt.eu> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
willy tarreau authored
[ Upstream commit 29021366 ] If a queue timeout is reported, we can oops because of some schedules while the caller is atomic, as shown below : mvneta d0070000.ethernet eth0: tx timeout BUG: scheduling while atomic: bash/1528/0x00000100 Modules linked in: slhttp_ethdiv(C) [last unloaded: slhttp_ethdiv] CPU: 2 PID: 1528 Comm: bash Tainted: G WC 3.13.0-rc4-mvebu-nf #180 [<c0011bd9>] (unwind_backtrace+0x1/0x98) from [<c000f1ab>] (show_stack+0xb/0xc) [<c000f1ab>] (show_stack+0xb/0xc) from [<c02ad323>] (dump_stack+0x4f/0x64) [<c02ad323>] (dump_stack+0x4f/0x64) from [<c02abe67>] (__schedule_bug+0x37/0x4c) [<c02abe67>] (__schedule_bug+0x37/0x4c) from [<c02ae261>] (__schedule+0x325/0x3ec) [<c02ae261>] (__schedule+0x325/0x3ec) from [<c02adb97>] (schedule_timeout+0xb7/0x118) [<c02adb97>] (schedule_timeout+0xb7/0x118) from [<c0020a67>] (msleep+0xf/0x14) [<c0020a67>] (msleep+0xf/0x14) from [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194) [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194) from [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24) [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24) from [<c024afc7>] (dev_watchdog+0x18b/0x1c4) [<c024afc7>] (dev_watchdog+0x18b/0x1c4) from [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c) [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c) from [<c0020cad>] (run_timer_softirq+0x115/0x170) [<c0020cad>] (run_timer_softirq+0x115/0x170) from [<c001ccb9>] (__do_softirq+0xbd/0x1a8) [<c001ccb9>] (__do_softirq+0xbd/0x1a8) from [<c001cfad>] (irq_exit+0x61/0x98) [<c001cfad>] (irq_exit+0x61/0x98) from [<c000d4bf>] (handle_IRQ+0x27/0x60) [<c000d4bf>] (handle_IRQ+0x27/0x60) from [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8) [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8) from [<c000fba9>] (__irq_usr+0x49/0x60) Ben Hutchings attempted to propose a better fix consisting in using a scheduled work for this, but while it fixed this panic, it caused other random freezes and panics proving that the reset sequence in the driver is unreliable and that additional fixes should be investigated. When sending multiple streams over a link limited to 100 Mbps, Tx timeouts happen from time to time, and the driver correctly recovers only when the function is disabled. Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: Ben Hutchings <ben@decadent.org.uk> Tested-by:
Arnaud Ebalard <arno@natisbad.org> Signed-off-by:
Willy Tarreau <w@1wt.eu> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
willy tarreau authored
[ Upstream commit 74c41b04 ] Stats writers are mvneta_rx() and mvneta_tx(). They don't lock anything when they update the stats, and as a result, it randomly happens that the stats freeze on SMP if two updates happen during stats retrieval. This is very easily reproducible by starting two HTTP servers and binding each of them to a different CPU, then consulting /proc/net/dev in loops during transfers, the interface should immediately lock up. This issue also randomly happens upon link state changes during transfers, because the stats are collected in this situation, but it takes more attempts to reproduce it. The comments in netdevice.h suggest using per_cpu stats instead to get rid of this issue. This patch implements this. It merges both rx_stats and tx_stats into a single "stats" member with a single syncp. Both mvneta_rx() and mvneta_rx() now only update the a single CPU's counters. In turn, mvneta_get_stats64() does the summing by iterating over all CPUs to get their respective stats. With this change, stats are still correct and no more lockup is encountered. Note that this bug was present since the first import of the mvneta driver. It might make sense to backport it to some stable trees. If so, it depends on "d33dc73 net: mvneta: increase the 64-bit rx/tx stats out of the hot path". Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by:
Eric Dumazet <edumazet@google.com> Tested-by:
Arnaud Ebalard <arno@natisbad.org> Signed-off-by:
Willy Tarreau <w@1wt.eu> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
willy tarreau authored
[ Upstream commit dc4277dd ] Better count packets and bytes in the stack and on 32 bit then accumulate them at the end for once. This saves two memory writes and two memory barriers per packet. The incoming packet rate was increased by 4.7% on the Openblocks AX3 thanks to this. Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by:
Eric Dumazet <edumazet@google.com> Tested-by:
Arnaud Ebalard <arno@natisbad.org> Signed-off-by:
Willy Tarreau <w@1wt.eu> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit fe6cc55f upstream. Marcelo Ricardo Leitner reported problems when the forwarding link path has a lower mtu than the incoming one if the inbound interface supports GRO. Given: Host <mtu1500> R1 <mtu1200> R2 Host sends tcp stream which is routed via R1 and R2. R1 performs GRO. In this case, the kernel will fail to send ICMP fragmentation needed messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu checks in forward path. Instead, Linux tries to send out packets exceeding the mtu. When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does not fragment the packets when forwarding, and again tries to send out packets exceeding R1-R2 link mtu. This alters the forwarding dstmtu checks to take the individual gso segment lengths into account. For ipv6, we send out pkt too big error for gso if the individual segments are too big. For ipv4, we either send icmp fragmentation needed, or, if the DF bit is not set, perform software segmentation and let the output path create fragments when the packet is leaving the machine. It is not 100% correct as the error message will contain the headers of the GRO skb instead of the original/segmented one, but it seems to work fine in my (limited) tests. Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid sofware segmentation. However it turns out that skb_segment() assumes skb nr_frags is related to mss size so we would BUG there. I don't want to mess with it considering Herbert and Eric disagree on what the correct behavior should be. Hannes Frederic Sowa notes that when we would shrink gso_size skb_segment would then also need to deal with the case where SKB_MAX_FRAGS would be exceeded. This uses sofware segmentation in the forward path when we hit ipv4 non-DF packets and the outgoing link mtu is too small. Its not perfect, but given the lack of bug reports wrt. GRO fwd being broken this is a rare case anyway. Also its not like this could not be improved later once the dust settles. Acked-by:
Herbert Xu <herbert@gondor.apana.org.au> Reported-by:
Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit d2069403 upstream. Will be used by upcoming ipv4 forward path change that needs to determine feature mask using skb->dst->dev instead of skb->dev. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit de960aa9 upstream. This moves part of Eric Dumazets skb_gso_seglen helper from tbf sched to skbuff core so it may be reused by upcoming ip forwarding path patch. Signed-off-by:
Florian Westphal <fw@strlen.de> Acked-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Borkmann authored
[ Upstream commit ffd59393 ] SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of 'struct sctp_getaddrs_old' which includes a struct sockaddr pointer, sizeof(param) check will always fail in kernel as the structure in 64bit kernel space is 4bytes larger than for user binaries compiled in 32bit mode. Thus, applications making use of sctp_connectx() won't be able to run under such circumstances. Introduce a compat interface in the kernel to deal with such situations by using a 'struct compat_sctp_getaddrs_old' structure where user data is copied into it, and then sucessively transformed into a 'struct sctp_getaddrs_old' structure with the help of compat_ptr(). That fixes sctp_connectx() abi without any changes needed in user space, and lets the SCTP test suite pass when compiled in 32bit and run on 64bit kernels. Fixes: f9c67811 ("sctp: Fix regression introduced by new sctp_connectx api") Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Acked-by:
Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Duan Jiong authored
[ Upstream commit a6254864 ] since commit 89aef892("ipv4: Delete routing cache."), the counter in_slow_tot can't work correctly. The counter in_slow_tot increase by one when fib_lookup() return successfully in ip_route_input_slow(), but actually the dst struct maybe not be created and cached, so we can increase in_slow_tot after the dst struct is created. Signed-off-by:
Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Bohac authored
[ Upstream commit 163c8ff3 ] aggregator_identifier is used to assign unique aggregator identifiers to aggregators of a bond during device enslaving. aggregator_identifier is currently a global variable that is zeroed in bond_3ad_initialize(). This sequence will lead to duplicate aggregator identifiers for eth1 and eth3: create bond0 change bond0 mode to 802.3ad enslave eth0 to bond0 //eth0 gets agg id 1 enslave eth1 to bond0 //eth1 gets agg id 2 create bond1 change bond1 mode to 802.3ad enslave eth2 to bond1 //aggregator_identifier is reset to 0 //eth2 gets agg id 1 enslave eth3 to bond0 //eth3 gets agg id 2 Fix this by making aggregator_identifier private to the bond. Signed-off-by:
Jiri Bohac <jbohac@suse.cz> Acked-by:
Veaceslav Falico <vfalico@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Emil Goode authored
[ Upstream commit eb85569f ] This patch removes a generic hard_header_len check from the usbnet module that is causing dropped packages under certain circumstances for devices that send rx packets that cross urb boundaries. One example is the AX88772B which occasionally send rx packets that cross urb boundaries where the remaining partial packet is sent with no hardware header. When the buffer with a partial packet is of less number of octets than the value of hard_header_len the buffer is discarded by the usbnet module. With AX88772B this can be reproduced by using ping with a packet size between 1965-1976. The bug has been reported here: https://bugzilla.kernel.org/show_bug.cgi?id=29082 This patch introduces the following changes: - Removes the generic hard_header_len check in the rx_complete function in the usbnet module. - Introduces a ETH_HLEN check for skbs that are not cloned from within a rx_fixup callback. - For safety a hard_header_len check is added to each rx_fixup callback function that could be affected by this change. These extra checks could possibly be removed by someone who has the hardware to test. - Removes a call to dev_kfree_skb_any() and instead utilizes the dev->done list to queue skbs for cleanup. The changes place full responsibility on the rx_fixup callback functions that clone skbs to only pass valid skbs to the usbnet_skb_return function. Signed-off-by:
Emil Goode <emilgoode@gmail.com> Reported-by:
Igor Gnatenko <i.gnatenko.brain@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicolas Dichtel authored
[ Upstream commit 08b44656 ] This bug was reported by Steinar H. Gunderson and was introduced by commit f7cb8886 ("sit/gre6: don't try to add the same route two times"). root@morgental:~# ip tunnel add foo mode gre remote 1.2.3.4 ttl 64 root@morgental:~# ip link set foo up mtu 1468 root@morgental:~# ip -6 route show dev foo fe80::/64 proto kernel metric 256 but after the above commit, no such route shows up. There is no link local route because dev->dev_addr is 0 (because local ipv4 address is 0), hence no link local address is configured. In this scenario, the link local address is added manually: 'ip -6 addr add fe80::1 dev foo' and because prefix is /128, no link local route is added by the kernel. Even if the right things to do is to add the link local address with a /64 prefix, we need to restore the previous behavior to avoid breaking userpace. Reported-by:
Steinar H. Gunderson <sesse@samfundet.no> Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Emil Goode authored
[ Upstream commit d43ff4cd ] The struct driver_info ax88178_info is assigned the function asix_rx_fixup_common as it's rx_fixup callback. This means that FLAG_MULTI_PACKET must be set as this function is cloning the data and calling usbnet_skb_return. Not setting this flag leads to usbnet_skb_return beeing called a second time from within the rx_process function in the usbnet module. Signed-off-by:
Emil Goode <emilgoode@gmail.com> Reported-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Haiyang Zhang authored
[ Upstream commit 891de74d ] Without this patch, the "cat /sys/class/net/ethN/operstate" shows "unknown", and "ethtool ethN" shows "Link detected: yes", when VM boots up with or without vNIC connected. This patch fixed the problem. Signed-off-by:
Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by:
K. Y. Srinivasan <kys@microsoft.com> Acked-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael S. Tsirkin authored
[ Upstream commit 0ad8b480 ] vhost checked the counter within the refcnt before decrementing. It really wanted to know that it is the one that has the last reference, as a way to batch freeing resources a bit more efficiently. Note: we only let refcount go to 0 on device release. This works well but we now access the ref counter twice so there's a race: all users might see a high count and decide to defer freeing resources. In the end no one initiates freeing resources until the last reference is gone (which is on VM shotdown so might happen after a looooong time). Let's do what we probably should have done straight away: switch from kref to plain atomic, documenting the semantics, return the refcount value atomically after decrement, then use that to avoid the deadlock. Reported-by:
Qin Chuanyu <qinchuanyu@huawei.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Acked-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nithin Sujir authored
[ Upstream commit c6993dfd ] Quoting David Vrabel - "5780 cards cannot have jumbo frames and TSO enabled together. When jumbo frames are enabled by setting the MTU, the TSO feature must be cleared. This is done indirectly by calling netdev_update_features() which will call tg3_fix_features() to actually clear the flags. netdev_update_features() will also trigger a new netlink message for the feature change event which will result in a call to tg3_get_stats64() which deadlocks on the tg3 lock." tg3_set_mtu() does not need to be under the tg3 lock since converting the flags to use set_bit(). Move it out to after tg3_netif_stop(). Reported-by:
David Vrabel <david.vrabel@citrix.com> Tested-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Michael Chan <mchan@broadcom.com> Signed-off-by:
Nithin Nayak Sujir <nsujir@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Ogness authored
[ Upstream commit bf06200e ] Commit 46d3ceab ("tcp: TCP Small Queues") introduced a possible regression for applications using TCP_NODELAY. If TCP session is throttled because of tsq, we should consult tp->nonagle when TX completion is done and allow us to send additional segment, especially if this segment is not a full MSS. Otherwise this segment is sent after an RTO. [edumazet] : Cooked the changelog, added another fix about testing sk_wmem_alloc twice because TX completion can happen right before setting TSQ_THROTTLED bit. This problem is particularly visible with recent auto corking, but might also be triggered with low tcp_limit_output_bytes values or NIC drivers delaying TX completion by hundred of usec, and very low rtt. Thomas Glanzmann for example reported an iscsi regression, caused by tcp auto corking making this bug quite visible. Fixes: 46d3ceab ("tcp: TCP Small Queues") Signed-off-by:
John Ogness <john.ogness@linutronix.de> Signed-off-by:
Eric Dumazet <edumazet@google.com> Reported-by:
Thomas Glanzmann <thomas@glanzmann.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjørn Mork authored
[ Upstream commit fbd3a77d ] This device was mentioned in an OpenWRT forum. Seems to have a "standard" Sierra Wireless ifnumber to function layout: 0: qcdm 2: nmea 3: modem 8: qmi 9: storage Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sabrina Dubroca authored
[ Upstream commit 00fe11b3 ] Currently, to make netconsole start over IPv6, the source address needs to be specified. Without a source address, netpoll_parse_options assumes we're setting up over IPv4 and the destination IPv6 address is rejected. Check if the IP version has been forced by a source address before checking for a version mismatch when parsing the destination address. Signed-off-by:
Sabrina Dubroca <sd@queasysnail.net> Acked-by:
Cong Wang <cwang@twopensource.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-